Archive for the ‘Programming’ Category
Friday, October 24th, 2008
Here is a short way to clean up bad HTML input and convert it to XML with Perl:
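use strict;
use warnings;
use HTML::TreeBuilder;
use XML::LibXML;
my $html_code = '';    # the untidy HTML input goes here
# Parse the HTML leniently; eof() tells the parser the input is complete
my $builder = HTML::TreeBuilder->new();
$builder->parse($html_code);
$builder->eof();
# Turn the builder into a plain element tree and serialize it as XML
my $tree = $builder->elementify();
my $xml_source1 = $tree->as_XML();
# Re-parse with libxml2 in recover mode so leftover errors are tolerated
my $parser = XML::LibXML->new();
$parser->recover(1);
my $doc = $parser->parse_string($xml_source1);
my $xml_source2 = $doc->toString();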
Posted in Programming | 2 Comments »
Monday, October 20th, 2008
While working on one of my projects recently, I noticed a curious problem. The server I was using was running out of memory while doing an XSLT transform. That was strange, because the transform in question was rather simple and the amount of memory ...
Posted in Programming | No Comments »
Monday, October 20th, 2008
During my investigations into Google Reader and iGoogle, I ran into an issue which has not been clearly addressed anywhere. The question is this: if a site provides a JSON feed without a callback function and you are using it on a different domain (meaning you cannot use XMLHttpRequest), can you ...
Posted in Programming | No Comments »
Sunday, October 19th, 2008
For the past few weeks, my RSS reader (Bloglines) has not been behaving. Now comes a post on TechCrunch that the founder of Bloglines is considering switching to Google Reader. I started exploring Google Reader to see if it would fit my needs (notice the new feed on the sidebar). ...
Posted in Programming, Projects | 1 Comment »
Thursday, February 28th, 2008
Installing server-side components on shared hosting is always a challenge. In the last few weeks, as I have begun to undertake more web-based consulting assignments, I have found myself needing source code management as well as project management. At my old startup, we used Subversion ...
Posted in Linux, Programming, Website | 6 Comments »
Thursday, November 22nd, 2007
This is an interesting problem that my wife had at work recently. In a VBA-based program, the Left function suddenly stopped working with an error along the lines of "type data mismatch". Since this is a native VBA function, my first thought was that it was caused by ...
Posted in Programming | No Comments »
Friday, November 9th, 2007
An interesting issue came up recently at my publishing company - one of our printing suppliers flagged incoming PDFs as not printable due to transparencies. After looking around for solutions, I came up with a way to resolve the issue without resorting to Acrobat (which we don't use). The solution is twofold:
1. First convert the incoming PDF to PostScript using XPDF's pdftops. This will flatten the transparencies. GhostScript's pdf2ps tool DOES NOT do that.
2. Then convert the PostScript back to PDF using GhostScript's ps2pdf tool.
Both tools are open source and free (although watch out for GhostScript's GPL license). One important point - pdftops requires a paper width and ...
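The two steps above can be sketched as a small Python wrapper around the command-line tools. This is only a rough illustration, not the exact commands I ran: the function names are made up, the temporary PostScript file name is derived from the output name, and the `-paperw`/`-paperh` values (612 x 792 points, US Letter) are placeholder assumptions that should be replaced with the real page size of the book.

```python
import subprocess
from pathlib import Path


def build_flatten_commands(src_pdf, out_pdf, paper_w_pt=612, paper_h_pt=792):
    """Return the two command lines: PDF -> PostScript -> PDF.

    The paper size defaults are illustrative assumptions (US Letter, in points).
    """
    # Intermediate PostScript file, named after the output PDF (hypothetical choice)
    tmp_ps = str(Path(out_pdf).with_suffix(".ps"))
    # Step 1: pdftops writes PostScript, which has no transparency model
    to_ps = ["pdftops", "-paperw", str(paper_w_pt), "-paperh", str(paper_h_pt),
             src_pdf, tmp_ps]
    # Step 2: ps2pdf turns the flattened PostScript back into a PDF
    to_pdf = ["ps2pdf", tmp_ps, out_pdf]
    return [to_ps, to_pdf]


def flatten_pdf(src_pdf, out_pdf, **paper):
    """Run both steps in order; raises CalledProcessError if either tool fails."""
    for cmd in build_flatten_commands(src_pdf, out_pdf, **paper):
        subprocess.run(cmd, check=True)
```

Calling `flatten_pdf("incoming.pdf", "flattened.pdf")` would run both tools in sequence, assuming pdftops and ps2pdf are installed and on the PATH. Keeping the command construction separate from the execution makes it easy to print the commands and check them before touching any files.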
Posted in Programming | 1 Comment »
Monday, July 30th, 2007
About two years ago I coded a small experimental search engine for books which used Ajax and Amazon web services. Recently, I went back to the same concept and put up a new experiment - a meta search engine for book information that aggregates book data from about 60 different ...
Posted in Programming, Projects | No Comments »
Wednesday, July 25th, 2007
One of the more mundane tasks that faces every publishing business like mine is data conversion. Recently, I have been involved in a major project which seeks to make several hundred titles available in print-on-demand format. Unfortunately, the library that scanned these titles did not use PDF ...
Posted in Programming | No Comments »
Tuesday, July 10th, 2007
For a while I have been working on a hobby project: a meta-search engine that lets you search multiple search engines by tag. The catch? No server-side components. This search engine works purely client-side, in the user's browser, by using RSS feeds from ...
Posted in Programming, Projects | No Comments »