thefoundation | http://www.thefoundation.de | 2011-03-01T17:25:05Z | (c) 2012 Michael Kurze, Aachen, Germany
Scalable Text Clustering | 2011-03-01T17:25:05Z | Michael Kurze | http://www.thefoundation.de/about/michael | scalable-text-clustering
<p>An updated version of the Grouperfish clustering plans has now been published on the Mozilla <a href="http://blog.mozilla.com/data/2011/03/08/scalable-text-clustering-for-the-web/" title="Blog of Data - Scalable Text Clustering for the Web">Blog of Data</a>.</p>
<h3>The Background</h3>
<p>During my work as <a href="http://blog.mozilla.com/metrics/">metrics</a> liaison with the <a title="Firefox Input" href="http://input.mozilla.com/">Firefox Input</a> team, an exciting <a title="Bug 629019: Cluster themes and sites as publicly available and output to JSON" href="https://bugzilla.mozilla.org/show_bug.cgi?id=629019">requirement</a> has come up: scalable online clustering of the millions of feedback items that the users of Firefox share with us.</p>
<p>When designing a service at the metrics team, besides functional requirements (<em>accept text messages, produce clusters</em>) we consider scalability and durability. In fact, scalability concerns play a major role in wanting to replace the <a title="Dave Dash’s textcluster" href="https://github.com/davedash/textcluster">current solution</a> (which has done a fine job so far) rather than picking another powerful <a title="Carrot2 clustering framework" href="http://project.carrot2.org/">existing tool</a>: We expect the influx of messages (already heading towards 2 million) to increase up to 50x once Firefox 4 is released.</p>
<h3>On to Architecture</h3>
<p>There is a <a title="Grouperfish architecture" href="https://github.com/michaelku/grouperfish/blob/master/doc/medium_sized_picture.pdf?raw=true">slide</a> outlining what the system (called <a title="Grouperfish on github" href="https://github.com/michaelku/grouperfish">Grouperfish</a>) is planned to look like. Because this service is to be developed quickly and in iterations, even major parts of the system might be replaced in the future. <em>This</em> is the rationale for our first version, to be released sometime around the Firefox 4 release:</p>
<h4>Concurrency</h4>
<p><em>We want to be able to handle tens of thousands of GETs and thousands of POSTs per second, provided we have enough commodity hardware at our disposal.</em></p>
<p>To accept incoming documents and queue them for clustering, <a title="Node.JS Website" href="http://nodejs.org/">Node.JS</a> fits the bill. Its event-based concurrency model dominates thread- and process-based designs in IO-bound tasks such as this. Also, depending on the storage you pick, requests might pause to wait on garbage collection or to rewrite store files. Node can handle a lot of waiting requests because it does not use system level threads (or even processes) for concurrency.</p>
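<p>To illustrate why an event loop copes well with many waiting requests, here is a minimal sketch in Python's <tt>asyncio</tt> rather than Node.JS. The names <tt>handle_request</tt>, <tt>serve</tt> and <tt>run_demo</tt> are made up for this illustration, and the sleep merely stands in for a slow storage call; it is not part of Grouperfish.</p>

```python
import asyncio
import time


async def handle_request(request_id, storage_delay):
    # Stand-in for a slow storage call. While this request waits,
    # the event loop is free to serve other requests.
    await asyncio.sleep(storage_delay)
    return f"request {request_id} stored"


async def serve(n_requests, storage_delay):
    # All requests wait concurrently on a single thread.
    return await asyncio.gather(
        *(handle_request(i, storage_delay) for i in range(n_requests))
    )


def run_demo(n_requests=1000, storage_delay=0.05):
    start = time.perf_counter()
    results = asyncio.run(serve(n_requests, storage_delay))
    elapsed = time.perf_counter() - start
    return results, elapsed
```

<p>Since the waits overlap, the total time should stay close to a single storage delay instead of growing with the number of requests, and no thread or process is needed per request.</p>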
<h4>Storage</h4>
<p><em>Grouperfish must store millions of documents in hundreds of thousands of collections. The generated clusters may reference thousands of documents each, each ranging from a few bytes to about a megabyte. Also, we want to store processing data for clustering.</em></p>
<p>When planning for more data than fits into your collective RAM, you usually have two options (SQL not being one of them since RAM has become pretty big):</p>
<p><a title="Wikipedia: Amazon Dynamo" href="http://en.wikipedia.org/wiki/Dynamo_%28storage_system%29">Dynamo</a>-style key/value stores like <a title="Basho Riak" href="http://www.basho.com/riak">Riak</a> and <a title="Apache Cassandra" href="http://cassandra.apache.org/">Cassandra</a> let you store replicated values at high write rates and quickly retrieve individual items from disk. You do not need to worry about one machine getting too much attention (e.g. when one of your services gets slashdotted), thanks to consistent hashing. Riak even has a notion of <em>buckets, keys</em> and <em>values</em>: We would intuitively use buckets for collections of documents (and of clusters), and values for individual documents (and clusters). No wonder we looked at this more closely. Unfortunately though, Riak’s buckets are more of a namespacing device than anything else. It is expensive to get all elements of a bucket, since they are neither indexed by a common key nor stored together on disk. The Riak design can be a bit <a title="Getting all the keys" href="http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-January/002947.html">misleading</a> in this regard, as buckets are in fact <a title="Layout of Riak buckets" href="http://en.wikipedia.org/wiki/There%27s_a_Hole_in_My_Bucket">spread</a> throughout the key space. To retrieve all keys in a bucket, Riak will check every single key, possibly scanning gigabytes of main memory (for the very recent <a title="Riak search for Riak values" href="http://wiki.basho.com/Riak-Search---Indexing-and-Querying-Riak-KV-Data.html">Riak search</a> to help, you’d need to blow up your values quite a bit). And then you still only have the keys. To get possibly millions of associated values, you need to move your little disk heads a lot. This is not always as bad as it sounds because Riak gives you streaming access to the data as it comes in.
But in general, the smaller your buckets in relation to the entire key space, the higher the cost of retrieving many of them.</p>
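<p>The following sketch shows the idea behind consistent hashing, and why the keys of one bucket end up scattered over all machines. It is a simplified toy ring, not Riak's actual implementation; the <tt>HashRing</tt> class, the node names and the <tt>feedback/…</tt> key layout are assumptions made for this example.</p>

```python
import bisect
import hashlib


def _hash(value):
    # Map a string to a point on the ring.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node owns several points on the ring.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # The first ring point at or after the key's hash owns the key;
        # wrap around at the end of the ring.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
# 1000 keys that all belong to the same (hypothetical) bucket "feedback".
keys = [f"feedback/{i}" for i in range(1000)]
owners = {ring.node_for(key) for key in keys}
```

<p>Because placement depends only on the hash of each key, a popular bucket cannot overload a single machine. The flip side is visible in <tt>owners</tt>: the members of one bucket live on every node, so listing them means asking all of them.</p>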
<p>The other major contenders are column-oriented data stores of the <a title="The Google BigTable paper" href="http://labs.google.com/papers/bigtable.html" target="_blank">BigTable</a> family, the most prominent of which is <a title="Apache HBase site" href="http://hbase.apache.org/">Apache HBase</a> (the aforementioned Cassandra is actually somewhat in-between, having properties from both worlds). There are two main differences between HBase and Dynamo-style stores as far as we are concerned: <em>1. Data is stored per column family:</em> to retrieve the vector representations of a million documents, we do not have to scan through a million document texts. <em>2. Records are sorted</em> <em>by key</em>, much like in a traditional database (but optimized for fast inserts, using <a title="Log Structured Merge Trees" href="http://nosqlsummer.org/paper/lsm-tree">LSM trees</a>). This is a blessing and a curse. A blessing, because we can scan over contiguous collections of documents. A curse, because we are vulnerable to <em>hotspotting</em> on popular collections. To counter this, we need to make sure that there are random parts in our row keys, e.g. using UUIDs. Because HBase divides tables into regions as they grow and hands them off to other nodes, this method avoids hotspots. And we do not lose the streaming advantage as long as we use common prefixes per collection.</p>
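<p>One way to read the row key idea is sketched below. The post only asks for "random parts" and "common prefixes per collection"; deriving the prefix from a hash of the collection name and appending a UUID per document is an assumption of this sketch, and <tt>SortedTable</tt> is a toy stand-in for a sorted store, not the HBase client API.</p>

```python
import bisect
import hashlib
import uuid


def collection_prefix(collection_name):
    # Pseudo-random but stable prefix, so that collections are spread
    # over the key space instead of clumping by name (assumed scheme).
    return hashlib.md5(collection_name.encode("utf-8")).hexdigest()[:8]


def make_row_key(collection_name):
    # Common prefix per collection, random suffix per document.
    return f"{collection_prefix(collection_name)}/{uuid.uuid4().hex}"


class SortedTable:
    """Toy table that keeps its rows sorted by key."""

    def __init__(self):
        self._keys = []
        self._rows = {}

    def put(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan_prefix(self, prefix):
        # Jump to the first matching key, then read sequentially
        # until the prefix no longer matches.
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._rows[key]
```

<p>Usage might look like this: insert documents with <tt>table.put(make_row_key("firefox-4"), text)</tt> and later stream the whole collection with <tt>table.scan_prefix(collection_prefix("firefox-4") + "/")</tt>, which touches only the contiguous range that belongs to it.</p>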
<p>Given our access patterns (insert documents, update clusters, re-process entire collections, fetch lists of clusters), efficient sequential access to selected parts of the data is very important. Sorted, column-oriented storage seems to be the way to go. There are other pros and cons (single point of failure, write throughput, hardware requirements), but if we don’t cater to our use case, those won’t ever matter.</p>
<h4>Clustering</h4>
<p><em> Grouperfish must be able to handle small numbers of large corpora (millions of documents), as well as large numbers of small corpora (millions of collections). The generated clusters may contain thousands of messages each.</em></p>
<p>This is practically a no-brainer: Apache Mahout supports in-memory operation (for smaller clusters) as well as distributed clustering (using Apache Hadoop, for larger clusters). Mahout can update existing clusters with new documents and generate labels for our clusters. Of course, Mahout is a Java library, so we need to run it within a JVM. To simplify management and introspection, we will run our clustering workers in Jetty web containers.</p>
<h4>Scheduling</h4>
<p><em> We need to be able to add workers to increase clustering frequency. When there are more new messages than can be clustered right away, we want them to be queued. Also, we have Node.JS and we have Java/Mahout. We want our queue to bridge the gap.</em></p>
<p>Messaging has become a big topic as systems have become larger and more distributed. We want to use messages to decouple write requests from processing them. There is a very elegant solution to maintain queues, offered by the in-memory data store <a title="Redis" href="http://www.redis.io/">Redis</a>. Redis is somewhat like a developer’s dream of shared memory. No encoding and decoding of lists, maps and values as they enter and leave the store; you just operate on your data structures within shared memory. Unfortunately, Redis queues are really just a linked list with a blocking POP operation. While that is very nice, we want to track and resubmit failed tasks when a worker node falls victim to <a title="Not exactly rodents" href="http://theoatmeal.com/blog/fail_whale">rampaging rodents</a>.</p>
<p>The considerations behind choosing <a title="RabbitMQ" href="http://www.rabbitmq.com/">RabbitMQ</a> to realize a task queue are worth an article of their own. Suffice it to say, it has Node and Java bindings, and it supports message acknowledgement from workers. We still want to use Redis to keep track of collection size, to cache the actual incoming data (no need to ask HBase if we use it right away), and for locking of collections, so that every collection is only modified by one worker at a time. We also <em>might </em>use it to cache frequently requested clusters.</p>
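<p>What acknowledgements buy us over a plain blocking POP can be sketched with a small in-memory queue. This is not the RabbitMQ API; the class <tt>AckQueue</tt> and its methods are invented for this illustration of the general idea.</p>

```python
import collections
import itertools


class AckQueue:
    """Toy task queue that remembers tasks until they are acknowledged."""

    def __init__(self):
        self._pending = collections.deque()
        self._unacked = {}
        self._tags = itertools.count(1)

    def publish(self, task):
        self._pending.append(task)

    def get(self):
        # Hand out the next task together with a delivery tag,
        # or None if nothing is waiting.
        if not self._pending:
            return None
        tag = next(self._tags)
        task = self._pending.popleft()
        self._unacked[tag] = task
        return tag, task

    def ack(self, tag):
        # The worker finished the task; forget about it.
        self._unacked.pop(tag)

    def requeue_unacked(self):
        # Called when a worker is presumed dead: everything it never
        # acknowledged goes back into the queue, oldest first.
        for tag in sorted(self._unacked):
            self._pending.append(self._unacked[tag])
        self._unacked.clear()
```

<p>With a plain POP, a task is gone the moment a worker takes it. Here a task whose worker dies before calling <tt>ack</tt> is still known to the queue and can be handed to another worker.</p>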
<h3>More Thoughts</h3>
<p>Selecting these components, I learned that it is important to choose technologies in an unbiased fashion, and to reconsider decisions when a technology has no answer for your requirement. For example, I originally wanted to use just Riak for storage — I like its simplicity and style, and the bucket metaphor — but the enumeration of large buckets would be too slow for an online system. It might be fine for a batch-only system, or a system that just does not operate on collections of varying size as much.</p>
<p>For a message queue, <a title="zeromq" href="http://www.zeromq.org/">ØMQ</a> sounded awesome, offering low latency and powerful constructs, but I quickly realized that it is not really what I understand a <a title="Wikipedia on Message Queues" href="http://en.wikipedia.org/wiki/Message_queue">message queue</a> to be, but rather a very smart abstraction over traditional sockets. Probably someone will eventually build a distributed task queue on top of it though.</p>
Diaspora — Can the Social Graph be Our Web of Trust? | 2010-08-23T01:30:22Z | Michael Kurze | http://www.thefoundation.de/about/michael | diaspora-and-the-web-of-trust
<p>On Friday we had Max, Ilya and Raphael from <a href="http://www.joindiaspora.com" title="Diaspora Project Site">Diaspora</a> over at Mozilla. They <a href="http://tieguy.org/blog/2010/08/20/notes-on-diaspora-talk/" title="Luis Villa’s Notes on the Diaspora Talk">talked</a> about their effort in creating a distributed social network. Here is where I think they are on the right track, and where they should think even bigger.</p><h3>Why we need Diaspora</h3>
<p>
Personally, I see three major challenges that everyone passionate about the <a href="http://www.mozilla.org/about/manifesto.en.html" title="Principles of the Open Web, as outlined by the Mozilla Manifesto">open internet</a> needs to make up their mind about:
</p>
<ul style="margin-bottom:0.5em; margin-top:0.3em; padding-top: 0;">
<li><em>The <a href="http://googlepublicpolicy.blogspot.com/2010/08/joint-policy-proposal-for-open-internet.html" title="Google Public Policy on the Verizon deal">erosion</a> of <a href="http://dig.csail.mit.edu/2006/06/neutralnet.html" title="Daniel Weitzner: The neutral internet">Net Neutrality</a></em></li>
<li><em>Participants <a href="http://futureoftheinternet.org/" title="The Future of the Internet and How to Stop it by Jonathan Zittrain">switching to closed</a> environments of apps and appliances, becoming mere consumers (*)</em> </li>
<li><em>People entrusting their personal data and social activity to Facebook, forced to <a href="http://www.geekymomblog.com/2010/05/18/the-facebook-dilemma/" title="Geeky Mom on the Facebook dilemma">choose</a> between control and connectedness</em></li>
</ul>
<p>In the context of the Diaspora talk, I’ll focus on the third issue.</p>
<p>We need Diaspora because people need to be in control of whom they share personal information with. Every time Facebook <a href="http://www.aclunc.org/issues/technology/blog/facebook_places_check_this_out_before_you_check_in.shtml" title="http://arstechnica.com/web/news/2010/08/privacy-groups-facebook-already-facing-off-over-places.ars">sneaks in</a> a new default that breaks privacy, we grudgingly change the settings again and stay, not wanting to lose our friends. Or we just don’t know about it and leave it as it is. Combined with the social monopoly that Facebook has established, this makes privacy and security optional features, subject to change like any other.</p>
<h3>How Diaspora can help already</h3>
<p>
The main distinguishing factor of Diaspora compared to Facebook et al. is that it decouples your social graph from the network provider, bringing back real competition to the social space. As with e-mail, there can be lots of network providers, loosely connected over push interfaces. Whenever a pod (the equivalent of an e-mail provider in Diaspora) violates your trust, you can just switch to another one, or set up your own pod.
</p>
<h3>What could be done better</h3>
<p>
On the downside, this means that you have to trust your pod as well as all your friends’ pods. <em>No big deal?</em> Well, where the same server software is used on a distributed network, it is very prone to exploitation of <a href="http://en.wikipedia.org/wiki/Sendmail#History_of_vulnerabilities" title="History of Vulnerabilities in the popular mail server sendmail">vulnerabilities</a> due to patch delay and misconfiguration (correctly setting up <abbr title="Transport Layer Security">TLS</abbr> is still a big challenge, <a href="http://www.theinquirer.net/inquirer/news/1727426/us-government-fails-secure-websites" title="The Inquirer: DHS fails to secure its website">not only</a> for regular people).
</p>
<p>
<a href="http://en.wikipedia.org/wiki/HTTP_Secure" title="Wikipedia on HTTPS">Secure HTTP</a> is great when a large, anonymous group of people needs to trust a central service. It allows us to do online banking and purchases, free from eavesdropping and man-in-the-middle attacks. However, it is not peer-to-peer: When you fetch your mail over a secure IMAP connection, you might be sure that your password is protected, but you do not know who actually sent you that e-mail (think about it: that is the reason why phishing works). When you get it from Google Mail, you might be using TLS, but Google is still able to read your every conversation.
</p>
<h3>How PGP can solve this</h3>
<p>
I propose that Diaspora pods should be dumb post boxes that <em>are not able</em> to actually look into status updates, private messages, friend lists and so on. <em>How?</em> The technology for that has been available for quite some time and is called <a href="http://www.pgpi.org/doc/pgpintro/" title="Introduction to PGP">PGP</a>.
</p>
<p>
Basically, PGP allows you to send and receive messages that cannot be decrypted by the servers that route them. So, if you were to encrypt your message inside your browser, you would establish secure end-to-end communication. No need to trust the shady pods that some of your friends decided to use, not knowing any better. <em>But encryption in a web client? That sounds awfully slow!</em> Well, <a href="https://addons.mozilla.org/z/en-US/firefox/addon/10868/privacy/" title="Firefox Sync (aka Weave)">Firefox Sync</a> does it already with your entire browsing history (the pass phrase to your key is never sent to the server), and I would imagine that JavaScript interpreters have become fast enough to emulate the cryptographic capabilities of a PC from 1991.
</p>
<p>I do have ideas on how to approach search and incremental profile updates with this, and on the new security considerations that apply here (Can you always trust your browser? Could a pod not make you use an insecure web client that transmits your passphrase?). However, that is rather technical, possibly material for a follow up post.
</p>
<h3>The social network is a key signing party</h3>
<p>
The problem with PGP has always been that people have been unable to exchange public keys in a manner that is both trustworthy and extensive. Because a <a href="http://en.wikipedia.org/wiki/Web_of_trust" title="Wikipedia on the Web of Trust">web of trust</a> often cannot be established, people refrain from using encrypted e-mail. It turns out that social networks come with a mechanism that is just made for this: <em>Friending</em>. In the secure social network, accepting a friend request would be equivalent to exchanging keys. Usually you are referred to friends by people you already know, so there already is a basic level of trust.
</p>
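<p>To make the friending idea a bit more concrete, here is a minimal sketch that treats accepted friend requests as edges in a graph and looks for the shortest chain of introductions between two people. The function <tt>trust_path</tt> and the sample names are hypothetical, and a real web of trust would verify a key signature at every hop, which this sketch leaves out.</p>

```python
from collections import deque


def trust_path(friends, start, target):
    """Return the shortest chain of friendships from start to target,
    or None if the two are not connected."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for friend in friends.get(path[-1], ()):
            if friend not in seen:
                seen.add(friend)
                queue.append(path + [friend])
    return None


# Each accepted friend request is recorded in both directions.
friends = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob"],
    "dave": [],
}
```

<p>Here <tt>trust_path(friends, "alice", "carol")</tt> finds the chain through bob, while dave, who has not friended anyone, cannot be reached at all.</p>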
<p>
This means that online social networks can be transformed from a threat to our security into a vehicle for it. This idea is of course also <a href="http://serendipity.ruwenzori.net/index.php/2009/03/18/pgp-web-of-trust-meets-modern-social-networking" title="PGP web of trust meets modern social networking by Jean-Marc Liotier">not entirely new</a>. What might be new is the idea of building the social web entirely on top of PGP rather than just integrating it as an optional feature.
</p>
<h3>Any Comments?</h3>
<p>I have not gotten around to adding commenting or pingback to this blog, but I would love to incorporate any (links to) comments in a follow-up post. Please write to michael at this domain.</p>
<h3>Update:</h3>
<p>
If I understand correctly, the Diaspora guys are already planning to use GPG for cryptography <a href="http://www.joindiaspora.com/2010/04/21/a-little-more-about-the-project.html" title="Diaspora Blog: A little more about the project">somewhere</a>. This is a pretty good start. If they really already plan on generating keys for everyone, then they would only need to pull the actual encryption into the web client.
</p>
<p style="font-size: 85%;"><em>(*) Like any intern at Mozilla I had the opportunity to talk to John Lilly, and I got the impression that Mozilla takes this development very seriously.</em></p>
Sites for Mozilla Input | 2010-08-10T18:46:36Z | Michael Kurze | http://www.thefoundation.de/about/michael | sites-mozilla-input
<p>As a side project during my internship at Mozilla, I <a href="http://aakash.doesthings.com/2010/08/10/firefox-input-1-6-2-is-released-more-malory/">worked with Aakash</a> from Mozilla QA to bring <a href="http://input.mozilla.com/sites" title="Input Dashboard: Sites">a new feature</a> to the Mozilla Input website.</p><p>
Oftentimes when users have trouble with a Firefox beta, there is not actually a bug in the beta, but a problem with a specific website (such as broken <a href="http://www.anybrowser.org/campaign/" title="Good old anybrowser website, unfortunately still an issue">user agent detection</a>). Even when a problem is related to Firefox, it can be very helpful for QA to see what sites trouble our users the most, and what issues the users face there.
</p>
<h3>Enter clustering…</h3>
<p>
To group sentiment by topic, my fellow metrics intern Andres and I made use of Dave Dash’s <a href="http://github.com/davedash/textcluster" title="Textcluster on github">clustering algorithm</a>, which uses techniques from the search engine world to group related input. That helps to get a quick impression of what’s going on when a site is causing trouble for many users. We also get a lot of positive feedback on sites where the user experience has improved for beta users compared to the release version.
</p>
<h3>…and Django of course!</h3>
<p>
It was very cool to do something with Django again. The webdev team is very knowledgeable in this area, so I learned a lot working with <a href="http://fredericiana.com/" title="Fred Wenzel’s blog">Fred</a> and <a href="http://davedash.com/" title="Dave Dash’s site">Dave</a>. There are some limitations (you <em>still</em> <a href="http://blog.affien.com/archives/2009/05/30/django-annoyances-no-reverse-select_related/" title="Django annoyances — no reverse select related">cannot prefetch related objects</a> along the inverse edge of a one-to-many relationship, as you can with any sensible ORM), but other than that Django has become a pretty solid toolkit. Also I finally got started with Git, which is as of now my version control system of choice.
</p>
<p>
Hopefully my main project will allow me time to improve Input and the dashboard in the future; there’s a lot of cool stuff planned for it.
</p>
Going to Mozilla | 2010-06-18T17:07:33Z | Michael Kurze | http://www.thefoundation.de/about/michael | going-mozilla
<p>Starting on Monday, June 21, I am going to intern at the <a href="http://www.mozilla.com" title="mozilla.com">Mozilla Corporation</a> (MoCo) in Mountain View, California. Yay!</p><p>
For quite some time I have been following the <a href="http://mozillazine.org" title="mozillaZine">mozine</a> and later <a href="http://planet.mozilla.org" title="Planet Mozilla">PMO</a>.
So I am absolutely thrilled to have this opportunity, and this also means that this blog will get a new
<a href="/michael/on/mozilla" title="Articles on Mozilla">topic</a> added. Not only will I get to know many more interns with whom I am going to live in <a href="http://en.wikipedia.org/wiki/Mountain_View,_California" title="Mountain View (Wikipedia)">Mountain View</a>, and not only will I participate in the Mozilla project together with all the great people at the MoCo HQ; I will also be attending the Mozilla Summit, the biennial meeting of the people from all over the world who made great projects such as the Firefox web browser and <a href="http://addons.mozilla.org" title="Mozilla Addons">AMO</a> possible.
</p>
<p>
My internship position will be at the <a href="http://blog.mozilla.com/metrics/" title="Mozilla Blog of Metrics">metrics department</a> led by Ken Kovash, and quite probably I will be allowed to go into the details of my project, either on this blog or on a Mozilla blog.
</p>
<photo slug="leaving-aachen" size="display">Leaving for CA</photo>
<p>
If you plan to go abroad to the U.S. for an internship, I suggest you apply for the internship position(s) of your choice at least two months before the actual start of the internship. I was a bit late to the party and that led to a rather tight schedule: As a Germany-based student at RWTH Aachen University, I had to invest some time in getting the visa. But fortunately there is a very helpful <a href="http://cicdgo.com/" title="CICD">visa sponsoring partner</a>, so everything went smoothly after all.
</p>
<p>
I do not know about other areas, but as a student in computer science you can expect compensation for an internship in the U.S., which is not necessarily the case in Germany. I applied at two organizations, and in both cases their offers covered living expenses and the flight to California. So I really do recommend that next spring you visit the web site of any company or organization you always wanted to get to know, and apply for an internship there. Make sure that the professional and academic experience on your resume matches the position you apply for, and prepare for two to three phone interviews.
</p>
OneSocialWeb: more than Jabber for Apps | 2010-04-30T14:00:21Z | Michael Kurze | http://www.thefoundation.de/about/michael | onesocialweb-more-than-jabber
<p>Almost a month ago, the presentation of <a href="http://onesocialweb.org" title="OneSocialWeb">OneSocialWeb</a> at the Android developers conference <a href="http://droidcon.be" title="Droidcon 2010 Belgium">droidcon.be</a> was one of the most interesting talks there. Recently the XMPP-centric framework has gone open source.</p><p>
Last year a group of fellow students and I were tasked with creating an Android application to organize meetings spontaneously (think something like doodle, only mobile and more short term). At that time we were thinking about using <a href="http://de.wikipedia.org/wiki/Extensible_Messaging_and_Presence_Protocol" title="Extensible Messaging and Presence Protocol">XMPP</a> for real-time communication, but were hesitant because of the time it would cost us to implement. In the end we used a traditional REST-based web service rather than a peer-to-peer system. Luckily there already is an effort underway called OneSocialWeb, funded by the telco provider Vodafone. It allows people to work together on Java objects using XMPP.
</p>
<p>
This means that all the work that has to do with XMPP protocol handling and conflict management will be handled by this abstraction layer, while we developers can focus on delivering useful applications. You could use this for simple things like associating chat conversations with arbitrary objects in your application. You could also try to model your entire application domain around this collaboration: In his presentation at droidcon <a href="http://eschnou.com/" title="Blog of Laurent Eschenauer">Laurent Eschenauer</a> demonstrated this using a collaborative shopping list where each participant could check off items, notifying the others immediately.
</p>
<p>
The Android platform with its services-model might really help in getting this concept to work, as XMPP protocol handling could be handled by one central service, dispatching updates to any interested Activities. This could well become the bidirectional, decentralized alternative to Apple’s Mobile Push service.
</p>
And Apos Semicolon: A Cathapostrophe | 2010-03-25T16:32:32Z | Michael Kurze | http://www.thefoundation.de/about/michael | and-apos-semicolon-cathapostrophe
<p>This morning on Facebook syndication, I reviewed the <a href="http://www.thefoundation.de/michael/2010/mar/24/thoughts-on-android-platform/" title="Thoughts on the Android">article on Android</a> that I wrote yesterday. And one of the few HTML-incompatible XHTML-properties assaulted my eyes, impersonated by a bunch of entity references.</p><p>Specifically, I had escaped the <em>typewriter apostrophe (')</em> using named entity reference syntax (&apos;). Unfortunately, I had forgotten that, while this entity is defined by XHTML 1.0, it is actually illegal in plain ol’ HTML. This should not have been a problem, as these pages are served using the XHTML 1.0 doctype where &apos; points to the Unicode code point 0x27, so that you can use single quotes to delimit attributes. </p>
<p>The Django RSS framework, however, would put a plain "html" content-type into the Atom feed, so the references to the apostrophe remained unresolved when the feed readers converted my contents for display. Instead, they correctly escaped the ampersand, which led to a lot of ugly entity references on my Facebook feed.</p>
<p>So for now I am going to reference the apostrophe using the Unicode code point reference &#x2019; (<em>punctuation apostrophe: ’</em>) which is actually recommended over the ASCII-compatible &#x0027; (<em>typewriter apostrophe: '</em>). Strictly speaking, I would not even need to use any entityref here, as 0x2019 is not XML syntax. Next I need to figure out if there is a way to configure the <a href="http://docs.djangoproject.com/en/dev/ref/contrib/syndication/" title="The Django syndication feeds framework">Django feeds framework</a> to use XHTML as a content type for Atom feeds and to check if the results are real-world-compatible.</p>
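<p>The double escaping described above can be reproduced with Python's standard <tt>html</tt> module. This is only a sketch of the mechanism, not the code path of any particular feed reader, and the sample string is made up. Note that <tt>html.unescape</tt> follows HTML5, where the apos entity is defined, so it plays the role of the XHTML-aware consumer here.</p>

```python
import html

# Markup as written in the post, using the XHTML-only named entity.
original = "Google&apos;s platform"

# A consumer that knows the entity resolves it to the apostrophe.
as_xhtml = html.unescape(original)

# A consumer that does not know it treats the text as literal
# and escapes the ampersand for display.
double_escaped = html.escape(original, quote=False)

# After one round of unescaping, the reader sees the raw entity.
displayed = html.unescape(double_escaped)

# The numeric reference does not depend on any named-entity table.
numeric = html.unescape("Google&#x2019;s platform")
```

<p>The numeric form resolves to the punctuation apostrophe no matter which set of named entities the consumer knows, which is why switching to it sidesteps the problem.</p>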
<p>But really, this just shows once more that it is absolutely inhumane to edit XHTML by hand. So I’ll be looking for a suitable <a href="http://en.wikipedia.org/wiki/WYSIWYM#In_web_environments" title="What you see is what you mean">WYSIWYM</a> editor to maybe handle this stuff.</p>
Die Klimaschützer | 2009-07-06T17:28:16Z | Daniel Becker | http://www.thefoundation.de/about/daniel | die-klimaschutzer
<p>Start using green energy: install the <a href="http://apps.facebook.com/dieklimaschuetzer/">Carbon Clock</a> on your Facebook profile or homepage!</p><p>Finally the <a href="http://apps.facebook.com/dieklimaschuetzer/">Carbon Clock</a> is online! The layout was created by <a href="http://www.widjet.de/">widjet</a>; my task was the Flash and programming work. It is my first <a href="http://facebook.com">Facebook</a> application! After some struggle with the <a href="http://developers.facebook.com/">API</a> the application works fine. Needless to say, I am a bit proud!<br />Based on the number of installs of this application, the Carbon Clock shows how much carbon dioxide emission is avoided by users of green energy.</p>
<div style="width:184px;height:250px;text-align:center"><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=9,0,0,0" width="184" height="225" id="CarbonClock" align="middle"> <param name="allowScriptAccess" value="always" /> <param name="allowFullScreen" value="false" /> <param name="movie" value="http://klima.stromauskunft.de/application/media/main.swf?wid=4a4dfaa1d1b04826&pid=4a5214a7bd60dc6b" /> <param name="quality" value="high" /><param name="bgcolor" value="#ffffff" /> <param name="flashvars" value="wid=4a4dfaa1d1b04826&pid=4a5214a7bd60dc6b&getAppUrl=http://www.clearspring.com/widgets/4a4dfaa1d1b04826" /> <embed src="http://klima.stromauskunft.de/application/media/main.swf?wid=4a4dfaa1d1b04826&pid=4a5214a7bd60dc6b" flashvars="wid=4a4dfaa1d1b04826&pid=4a5214a7bd60dc6b&getAppUrl=http://www.clearspring.com/widgets/4a4dfaa1d1b04826" quality="high" bgcolor="#ffffff" width="184" height="225" name="CarbonClock" align="middle" allowScriptAccess="always" allowFullScreen="false" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/go/getflashplayer" /></object><a href="http://www.stromauskunft.de/de/html/oekostrom/klimaschuetzer.html" style="font-size:12px;color:#6198C2;font-family:Helvetica,Arial,sans;display:block;padding-top:4px">Mach mit! Werde Klimaschützer!</a></div>
<p>Click the <q>Wer macht mit?</q>-Button to <a href="http://www.clearspring.com/widgets/4a4dfaa1d1b04826">install the widget</a> on your homepage or go to <a href="http://apps.facebook.com/dieklimaschuetzer/">the Facebook-Application</a> and install <q>Die Klimaschützer</q> to your profile!</p>
<p>Since I sometimes felt lost during development, I will soon write down my experiences and techniques in a short article.</p>
<p>
</p>
It appears I was wrong | 2009-04-08T23:18:11Z | Michael Kurze | http://www.thefoundation.de/about/michael | it-appears-i-was-wrong
<p>Yesterday, Google <a href="http://googleappengine.blogspot.com/2009/04/seriously-this-time-new-language-on-app.html" title="Seriously this time, the new language on App Engine: Java™">announced</a> the availability of Java as the new programming language for the App Engine, refuting <a href="http://www.thefoundation.de/michael/2008/sep/21/javascript-next-app-engine-language/" title="Is JavaScript The Next App Engine Language?">my guess</a> from last year that it might be JavaScript, though of course not entirely.</p><p>
If you take the demand of the users into account, Java is of course the right choice as the next language. It might just alienate lots of Java developers if a niche language of the server zoo such as JavaScript were to emerge first. Additionally, Java is one step short of full (albeit sandboxed) <abbr title="Java Virtual Machine">JVM</abbr> support.
</p>
<p>
To allow for sandboxing, Google wraps some of its own <abbr title="Application Programmer Interface">API</abbr>s into Java SE or <abbr title="Java Specification Request">JSR</abbr>-standardized services such as <tt>javax.mail</tt> and <tt>java.net.URL</tt>. Also, there is a <a href="http://code.google.com/appengine/docs/java/jrewhitelist.html" title="The JRE Class White List">white-list</a> currently containing 1323 of the 3700+ <a title="Overview (Java Platform SE 6)" href="http://java.sun.com/javase/6/docs/api/">Java SE 6</a> classes. Most of the classes that are not available are from the Swing and AWT suites, which a web developer will not need anyway. Instead, Google provides the homegrown <abbr title="Google Web Toolkit">GWT</abbr>.
</p>
<p>
Thanks to the free and open source <a href="http://www.mozilla.org/rhino/" title="Mozilla Rhino: JavaScript for Java">Rhino</a> JavaScript Interpreter written in Java, server side JavaScript on the App Engine is rather easy to achieve now. I guess I might have to check it out and report back about it later, so I just signed up for the Java technology preview on my App Engine account.
</p>
<p>There are <a title="Campfire One: App Engine Redux" href="http://www.youtube.com/view_play_list?p=DFDBB63922B90A70">some videos</a> from the Google Campfire event over at YouTube. Most of the time they are rather interesting, plus Kevin Gibbs does a pretty decent imitation of Steve Jobs during the presentation (intentionally or not).</p>
How to bring AS3 Tweens to an end2009-03-24T19:28:52ZDaniel Beckerhttp://www.thefoundation.de/about/danielhow-bring-as3-tweens-end<p>Have you ever assigned a tween to a local variable? Have you ever noticed tweens that never finish, or tweens that do not start at all? Blame ActionScript 3.0's ruthless garbage collector!</p><p>Back in the old days of ActionScript 2.0, scripts like the following were as easy as they were useful for programming sequences of animations (<abbr title="exempli gratia">e. g.</abbr> assembling animations for site sections):</p>
<code class="block">
function transitionToolTip():void {
var toolTipAlpha:Tween = new Tween( toolTip, "alpha", Regular.easeOut, 0, 1, .75, true );
toolTipAlpha.addEventListener( TweenEvent.MOTION_FINISHED, handleToolTipTransition );
}
function handleToolTipTransition( e:TweenEvent ):void { … }
</code>
<p class="annotation notice right">Just to make things clear: The ActionScript 3.0 garbage collector does a good job of deleting these local variables. Obviously the ActionScript 2.0 garbage collector should have behaved this way, too.</p><p>In most cases one does not want to stop or modify tweens after firing them, so one will not need a reference to the exact tween ever again. The next function would be triggered after the tween has finished. Great!<br />But nowadays things are different. ActionScript 3.0's garbage collector relentlessly does what it is supposed to do: delete everything that is no longer referenced. In the sample shown above, this means that once <code>transitionToolTip</code> returns, the local variable <code>toolTipAlpha</code> goes out of scope and the tween it referenced becomes eligible for collection. If a garbage collection cycle runs at an awkward moment, our tween will not play to its end.</p>
<p>One solution would be the declaration of a class level or global variable to store the reference to the tween. This would look something like this:</p>
<code class="block">
var toolTipAlpha:Tween;
function transitionToolTip():void {
toolTipAlpha = new Tween( toolTip, "alpha", Regular.easeOut, 0, 1, .75, true );
toolTipAlpha.addEventListener( TweenEvent.MOTION_FINISHED, handleToolTipTransition );
}
function handleToolTipTransition( e:TweenEvent ):void { … }
</code>
<p class="annotation notice center">All the samples are abbreviated and not taken from real-life ActionScript 3.0 projects, so they may not work as is. They are included purely to illustrate possible problems.</p>
<p>A far more elaborate description of the coexistence of tweens, variables and the garbage collector can be found on <a title="Scott Morgan – LA Flash and Flex Developer" href="http://www.scottgmorgan.com/">Scott Morgan's</a> <a href="http://www.scottgmorgan.com/blog/">blog</a> in the article <a title="AS3 Garbage Collection, the reason your tweens are ending early." href="http://www.scottgmorgan.com/blog/index.php/2007/11/18/as3-garbage-collection-the-reason-your-tweens-are-ending-early/">AS3 Garbage Collection, the reason your tweens are ending early.</a></p>
<p><a href="http://www.scottgmorgan.com/blog/index.php/2007/11/18/as3-garbage-collection-the-reason-your-tweens-are-ending-early/">via Scott Morgan's blog – http://www.scottgmorgan.com/</a></p>Stuck for gift ideas?2008-12-17T00:35:30ZDaniel Beckerhttp://www.thefoundation.de/about/danielstuck-gift-ideas<p>Santa will help you out! Get some great last-minute tips from <a href="http://www.myspace.com/thewidjetsanta">Santa</a> himself directly on your own homepage!</p><h2>Santa's Last Minute Tipps</h2>
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=9,0,0,0" width="100%" height="330" id="Santa" align="middle">
<param name="allowScriptAccess" value="always" />
<param name="allowFullScreen" value="false" />
<param name="movie" value="http://widjet.de/santa/santaWidget/santaWidget.swf" />
<param name="quality" value="high" /> <param name="wmode" value="transparent" />
<embed src="http://widjet.de/santa/santaWidget/santaWidget.swf" quality="high" width="100%" height="330" name="Santa" align="middle" allowScriptAccess="always" allowFullScreen="false" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/go/getflashplayer" wmode="transparent" />
</object>
<h2>Get Santa on your homepage!</h2>
<p>This widget was programmed using the <a href="http://www.clearspring.com/">Clearspring</a> <a href="http://www.clearspring.com/services/launchpad/in-widget">In-Widget-<abbr title="application programming interface">API.</abbr></a> The layout of <a href="http://www.myspace.com/thewidjetsanta">Santa's MySpace-Site</a> and the widget was created by <a href="http://widjet.de/">widjet</a> – a Cologne-based agency focused on <a href="http://en.wikipedia.org/wiki/Widget_engine">widget development.</a><br />My part was the implementation of the widget in <a href="http://www.adobe.com/products/flash/">Adobe Flash.</a> After struggling with the <a href="http://www.clearspring.com/services/launchpad/in-widget">Clearspring-<abbr title="application programming interface">API</abbr></a>, I had fun animating Santa and his Tipps!</p>
<p>You can really look forward to seeing Santa's next Tipps – and be sure not to miss a single one!</p>Google Street View car spotted in Cologne2008-10-18T22:50:00ZMatthias Schulzhttp://www.thefoundation.de/about/matthiasgoogle-street-view-car-spotted-cologne<p>I recently saw a strange car standing outside my house. I'm pretty sure it's the Google Street View car taking photos of Cologne! I managed to take some pictures...</p>I'm sorry for the bad image quality, but I only had my mobile on me and its camera is pretty bad...<br/>
There is a tripod-like construction on top of the car roof, on which the camera rig is presumably mounted. I wonder how fast the car can go without creating too much motion blur... well, I certainly don't want to be the guy who has to deal with all the photos they take day by day ;-)
<p>
<gallery slug="google-street-view-car">Google Street View Car in Cologne</gallery>
</p>Is JavaScript The Next App Engine Language?2008-09-21T04:00:28ZMichael Kurzehttp://www.thefoundation.de/about/michaeljavascript-next-app-engine-language<p>As <em>the</em> language of the browser, JavaScript has become a kind of common denominator among web programmers. On the server it is rare compared to Java, PHP, Ruby or Python. Could this change, since Google has built an implementation of JS <em>and</em> an affordable yet scalable web application infrastructure?</p><h2>Scripting Language of The Open Web</h2>
<p>
JavaScript was created as a scripting language for the Netscape 2.0 browser. Unlike its namesake Java, it has remained in the browser since then. Web developers know this language because they have to use it, whether they like it or not. Throughout the late 90s and well into this decade, JavaScript was mostly regarded as a toy language for silly effects and simple form checks. Due to the more recent experience with Ajax applications and the resulting thorough coverage in technical literature, the users of JavaScript have become more professional.
</p>
<h2>The Server Domain</h2>
<p>
The development on the server side has — with some notable exceptions — ignored the language JavaScript for many years. Perl was a dominant language in the early days. Then, PHP gained momentum in the fast growing developer community during the internet bubble due to its gentle learning curve, the simple integration with MySQL and <em>the overwhelming availability of inexpensive shared hosting</em>. You could say that in its ubiquity, PHP has mimicked JavaScript, only on the server.
<br />
Other languages and tool chains have gained momentum since then, most notably Ruby, Java and Python. They are not as affordable as PHP (except Python on the App Engine), and seem to be used mostly because of their modern <abbr title="Model, View, Controller">MVC</abbr> frameworks.
</p>
<h3>Why JavaScript Is Not Popular on The Server</h3>
<p>
JavaScript has some fundamental weaknesses that have inhibited its adoption as a server-side programming language:
</p>
<ul class="block">
<li>JavaScript (or rather ECMAScript) has no real standard library; it is just a language. There are no tools for file or process manipulation. There are no database bindings and there is no shared memory. There is no way to open a socket.</li>
<li>In the past, JavaScript Interpreters have not been known for their performance. Anyone who has seen the message <q>A script on this page may cause Firefox to run slow.</q> would think twice before writing an application in JavaScript.</li>
</ul>
<p>
The <a href="http://dev.helma.org/">Helma</a> framework tries to address the first issue by using the Rhino interpreter, making the Java SDK available to the JavaScript programmer. The second issue is at least partially resolved as some performance-intensive tasks are moved to Java. In Java 6, Rhino is even bundled with Java. Nevertheless, Helma and similar integration efforts are not very popular compared to Rails or Struts. A reason for this might be that Java for the web (the <em>enterprise edition</em>) is a very serious business, while the perception of JavaScript has long been ... well, the opposite of serious. The two cultures do not seem to mix well yet, however well the technologies might blend.
</p>
<h3>Why The App Engine Loves JavaScript</h3>
<p>
When Guido van Rossum adapted Python for the Google App Engine, he had to remove lots of features because they would not work with the sandboxed, replicating environment that is the App Engine. It turns out that the limitations of JavaScript really make it an ideal candidate for the App Engine.
</p>
<ul class="block">
<li>JavaScript is just a language: It cannot create sockets, files or processes. App Engine does not allow for that anyway, so this is a plus!</li>
<li>The language is easily sandboxed. Web Browsers have to do this all the time. This is ideal for a shared hosting environment.</li>
<li>It has no database bindings included, but a natural serialization format, <a href="http://www.json.org" title="JavaScript Object Notation">JSON</a>. This is great for the Google Data Store, because it stores objects, not table rows.</li>
<li>Recently, JavaScript Interpreters (or, let's say, <em>JavaScript Machines</em>) have become fast. There is <a href="http://code.google.com/p/v8/" title="">V8</a> and it is fast.</li>
</ul>
<p>
It is known (at least since Guido van Rossum's talk at DjangoCon 2008) that Google plans to adopt more languages for the App Engine. Also, there are four primary languages that Google uses for development: Python, Java, C++ and JavaScript. Python is already available. Java is a good candidate for the App Engine, but it is not exactly lightweight and certainly not <em>shared nothing</em>. C++ is probably the opposite of any sandboxable language.
</p>
<p>
In combination with a template engine, a free offer on the App Engine might give the already ubiquitous JavaScript language a final boost. The prospect of <em>one language</em> from model definition to client side computation is certainly compelling!
</p>
<p>
To encourage further speculation: With Google's V8 machine comes an embedding, <tt>process.cc</tt>, that contains <q>code necessary to extend a hypothetical HTTP request processing application - which could be part of a web server, for example (...)</q> (<cite><a href="http://code.google.com/apis/v8/samples.html" title="Sample Code — V8 JavaScript Engine Documentation">V8 Documentation</a></cite>).
</p>Nobody Expects The Production Problems2008-09-21T00:08:05ZMichael Kurzehttp://www.thefoundation.de/about/michaelnobody-expects-the-production-problems<p>Sometimes you need to track down problems in a production setup. This article describes how the combination of Django and flup makes this difficult, and why I think that both should provide more than <q>log by mail</q>.</p><h2>Our Setup</h2>
<p class="annotation right"><q>Our chief weapon is development!</q>
<cite><a href="http://people.csail.mit.edu/paulfitz/spanish/script.html" title="The Spanish Inquisition">— Cardinal Ximinez of Spain</a></cite>
</p>
<p>
To the developer, the <a href="http://www.djangoproject.org" title="Django Project Site">Django</a> web development Framework offers a set of debugging helpers such as the verbose error page with interactive traceback and the management commands. The automatic reloading of python modules allows you to instantaneously see changes to your application, without even losing running sessions. If that is not enough for you, you might be interested in the powerful <a href="http://code.google.com/p/django-command-extensions/" title="Django Command Extensions, Project Site">Django Command Extensions</a>. The <a href="http://rob.cogit8.org/blog/2008/Sep/19/introducing-django-debug-toolbar/" title="Rob Hudson:
Introducing the Django Debug Toolbar">Django Debug Toolbar</a> also looks promising yet rather unfinished. I might come back to that another time.
</p>
<h2>The Other Setup</h2>
<p class="annotation right">
<q>... and production! Our two weapons are development and production!</q>
</p>
<p>
Web applications written in Django are also highly portable: Development can take place on a different <abbr title="Operating System">OS</abbr> than production, using a different version of Python and different database and HTTP systems. With respect to debugging, these differences are evened out fairly well by the abstraction layers of the Python and Django stack. Once you have sorted out the dos and don'ts of your <abbr title="Relational Database Management System">RDBMS</abbr>, there is the decision on how to handle HTTP requests. Django <a title="Deploying Django | Django Documentation" href="http://docs.djangoproject.com/en/dev/howto/deployment/">recommends</a> using the Apache HTTP Server and <tt>mod_python</tt>. Another performant <a href="http://docs.djangoproject.com/en/dev/howto/deployment/fastcgi/" title="How to use Django with FastCGI, SCGI or AJP | Django Documentation">setup</a> that seems to be quite common is using a <a href="http://nginx.net/" title="nginx">modern</a> FastCGI-capable web server in combination with the <a href="http://trac.saddi.com/flup" title="flup Project">flup</a> FCGI/WSGI bridge. This is what I did.
</p>
<h4>Production: Turn Off the Debug Switch!</h4>
<p>
When preparing your production setup, you will want to look into your <code>settings.py</code>. Actually, you will probably have multiple settings files of different names in revision control. Then, you will create a symlink to the settings matching the current setup. In your production settings, you will turn off debugging. This replaces the verbose traceback pages with neat-looking <abbr title="Internal Server Error">500</abbr> views which will — hopefully — never see the light of day! And otherwise, you will receive a nice error report somehow, right?
</p>
<p>
Of course you will probably encounter a production error in your site, be it a bunch of downloaded third party applications or your own magnificent creation. In my case, I soon had <a href="http://joseph.randomnetworks.com/archives/2005/08/05/postgresql-index-limitation-index-row-size-xxxxx-exceeds-btree-maximum-2713/" title="PostgreSQL Index Limitation">problems with an index</a> on a text column.
</p>
<h2>The Third Setup</h2>
<p class="annotation right">
<q>... and ruthless stage testing! Our <em>three</em> weapons are development, production, and ruthless stage testing.</q>
</p>
<p>
To catch such problems before they occur with your production setup, it is highly desirable to use a <em>stage setup</em>. This setup has its own database, usually a copy of your production database, updated on demand. The other settings should mirror your production setup. Run your staging application on the production machine or — if you can afford it — on a separate identical system and make it accessible to selected testers only. For example, you might make it listen to the local IP only and ssh-tunnel from your development machine there. On this machine, you can turn on debugging whenever needed.
</p>
<h3>Problem One: Logging by E-mail</h3>
<p>
Of course, there might still be problems that are discovered in your production setup. Unfortunately, out of the box Django reports errors only <a href="http://docs.djangoproject.com/en/dev/howto/error-reporting/" title="Error reporting via e-mail | Django Documentation">via e-mail</a>. To me, this has some critical drawbacks:
</p>
<ul class="block">
<li>You do not know if the error reporting works unless a problem occurs. In this case, you will only know that error reporting works <em>if it works</em>. Avoid this problem by enabling <abbr title="Resource not found">404</abbr> reporting and then testing an invalid URL!</li>
<li>It is <em>not secure</em>! To my knowledge there is no PGP-support for the error mails sent by Django. The messages might contain sensitive data that should not be sent unencrypted.</li>
<li>If your production setup does not work right away, you will need debugging information in fast iterations. Obtaining such information by mail is inconvenient as it is tiresome to link a browser-click to an e-mail that you might receive minutes later.</li>
<li>Your machine might not be allowed to send mails! Many <abbr title="System Operators">sysops</abbr> put firewall rules in place to prevent machines in their domain from becoming spam robots. This is what I did on my production machine, so fortunately I knew in advance that debugging by mail would not work. I am not sure if every Django admin is aware of this potential problem.</li>
<li>If you host a high-traffic site and an error occurs (for example if your database does not respond due to slashdotting effects or denial of service attacks), the volume of e-mail it generates might be so high that the error reporting competes with your application for IP connections, worsening your problems.</li>
</ul>
<h4>Solution: Exception-Logging Middleware</h4>
<p>
For my basic needs, I have written a simple <a href="/files/michael/python-modules/exception_handling.py" title="Exception Logging middleware">middleware</a> that should handle any exceptions raised by views in your production setup. Please note that this middleware relies on the <a href="http://code.activestate.com/recipes/444746/" title="ActiveState Code Recipe 444746">Exception Helpers</a> module written by <a href="http://www.targeted.org/" title="Dmitry Dvoinikov&#39;s Homepage">Dmitry Dvoinikov</a> to format tracebacks.
The traceback information is written into the logfile specified by the (custom) <tt>LOG_FILE</tt> setting. If the <tt>LOG_LEVEL</tt> is at least as verbose as <tt>DEBUG</tt>, the request object is dumped as well.
</p>
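<p>To give a rough idea of the shape of such a middleware, here is a minimal sketch. It is not the linked module: it uses only the Python standard library (<tt>traceback</tt> instead of the Exception Helpers recipe), it is written for current Python rather than the version this article was written against, and the class name, logger name and constructor arguments are made up for illustration. The only Django convention it assumes is the <tt>process_exception(request, exception)</tt> hook of classic middleware classes.</p>

```python
import logging
import traceback


class ExceptionLoggingMiddleware:
    """Hypothetical sketch: log exceptions raised by views to a file."""

    def __init__(self, log_file="exceptions.log", log_level=logging.ERROR):
        # Assumption: in a real project these two values would be read from
        # the custom LOG_FILE and LOG_LEVEL settings mentioned above.
        self.logger = logging.getLogger("exception_log_sketch:%s" % log_file)
        self.logger.setLevel(log_level)
        self.logger.propagate = False
        if not self.logger.handlers:
            handler = logging.FileHandler(log_file)
            handler.setFormatter(
                logging.Formatter("%(asctime)s %(levelname)s %(message)s")
            )
            self.logger.addHandler(handler)

    def process_exception(self, request, exception):
        # Format the full traceback of the exception raised by the view.
        trace = "".join(
            traceback.format_exception(
                type(exception), exception, exception.__traceback__
            )
        )
        self.logger.error("Unhandled exception in view:\n%s", trace)
        # Dump the request only if the configured level is DEBUG or lower.
        if self.logger.isEnabledFor(logging.DEBUG):
            self.logger.debug("Request was: %r", request)
        # Returning None lets the framework continue with its own handling,
        # for example rendering the 500 page.
        return None
```

<p>Returning <tt>None</tt> is a deliberate choice in this sketch: the middleware only records the problem and leaves the response to whatever error handling is already in place.</p>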
<h3>Problem Two: Exceptions Outside of Views</h3>
<p>
We are now prepared for errors that might occur when running your views in production. But there is a different class of exceptions that cannot be handled by our middleware because of their nature:
</p>
<p>
On your Site, you probably want caching, <a href="http://en.wikipedia.org/wiki/HTTP_ETag" title="HTTP ETags">ETags</a> and transfer compression. So you enable the corresponding middleware, hopefully in the right <a href="http://phaedo.cx/archives/2007/07/26/django-middleware-order/" title="Django middleware order">order</a>, as <a href="http://effbot.org/zone/zone-django-notes.htm#middleware-order" title="Django Performance Observations">explained</a> by Fredrik Lundh.
And you might just add some middleware from <a href="http://www.djangosnippets.org/tags/middleware/" title="Django Snippets: Tag &quot;Middleware&quot;">djangosnippets</a> or some of your own. Perhaps you want to prettyprint or to simplify your output, or to yield custom error pages for Ajax-Requests.
</p>
<h4>Solution: Enable Flup Traceback Pages</h4>
<p>
Any exceptions raised during middleware processing are not caught and handled by Django, but instead propagated up to the flup FastCGI handler. By default, flup would present an error page like Django does in debugging mode, which is quite helpful (if not as beautiful). But Django <a href="http://code.djangoproject.com/changeset/4170" title="Django Trac — Changeset 4170">disables</a> this error page explicitly during startup. I filed a <a href="http://code.djangoproject.com/ticket/6610" title="Ticket #6610">patch</a> that allows you to specify a debug flag for your FastCGI process to remedy this issue.
</p>
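<p>Independently of that patch, the general idea of catching what escapes the framework can be sketched at the WSGI level. The following is a generic illustration in current Python, not flup's or Django's API: the function name <tt>logging_wsgi_wrapper</tt> is made up, and the wrapper only catches exceptions raised while the wrapped application is being called, not ones raised later while the response body is iterated.</p>

```python
import logging
import sys
import traceback


def logging_wsgi_wrapper(app, logger):
    """Hypothetical sketch: log any exception that escapes a WSGI callable."""

    def wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            # Record the path and the full traceback before answering.
            logger.error(
                "Unhandled exception for %s:\n%s",
                environ.get("PATH_INFO", "?"),
                traceback.format_exc(),
            )
            # Passing exc_info follows the WSGI convention for error
            # responses that replace a response already in progress.
            start_response(
                "500 Internal Server Error",
                [("Content-Type", "text/plain")],
                sys.exc_info(),
            )
            return [b"Internal Server Error"]

    return wrapped
```

<p>Such a wrapper would sit between the FastCGI bridge and the application callable, so it sees exceptions from middleware as well as from views. Whether that fits a given deployment depends on how the application object is handed to the server.</p>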
<h3>Summary</h3>
<p class="annotation right">
<q><em>Amongst</em> our weaponry are such diverse elements as development, production, ruthless stage testing, almost fanatical verboseness to the log and nice red flup tracebacks, my oh my oh my.</q>
</p>
<p>
To anticipate and tackle production bugs, the solution consists of these parts:
</p>
<ul>
<li>separate staging and production setups</li>
<li>use exception logging middleware</li>
<li>enable flup error pages in the stage setup</li>
</ul>
<p>
This is still not perfect. Ideally, flup would also log errors in production mode. Weirdly, flup also supports
<em>log by e-mail</em>, and in this case it is not even configurable from your Django app! I will probably supplement this article when I find an elegant way to fix this.
</p>
Integrating a plug-in into JW-Player 2008-09-20T02:06:41ZDaniel Beckerhttp://www.thefoundation.de/about/danielintegrating-plugin-jw-player<p>If you have spent time watching web videos in the popular <abbr title="Flash Video"><a href="http://www.adobe.com/devnet/flash/articles/video_guide.html">FLV</a></abbr> format, you might have stumbled upon the <a href="http://www.jeroenwijering.com/?item=JW_FLV_Media_Player">Jeroen Wijering player</a>. Version 4.1 features a plug-in interface, which I found pretty much undocumented.</p><p>In the following article I will try to give you a short insight into how to integrate plug-ins into the free and open-source Flash video player by <a href="http://www.jeroenwijering.com/">Jeroen Wijering</a>. All techniques described in this article worked with version 4.1.</p>
<h2>JW-Player plug-ins</h2>
<p class="annotation notice center">There are some restrictions for commercial use of the player. You have to apply for a commercial license at <a href="http://www.jeroenwijering.com/?page=order">http://www.jeroenwijering.com/?page=order</a></p>
<h3>What is JW-Player?</h3>
<p>In my eyes this player is a masterpiece of structured programming following the <abbr title="Model-View-Controller"><a href="http://en.wikipedia.org/wiki/Model-view-controller">MVC</a></abbr> architectural pattern. The source is as modular as possible, so it can easily be modified to fit new requirements. For example, it can be adapted to display videos from other sources (<abbr title="exempli gratia">e. g.</abbr> <a href="http://youtube.com/">YouTube</a>) by adding a new class as a <q>submodel</q>. The player allows <a href="http://code.jeroenwijering.com/trac/wiki/FlashSkinning">skinning</a> and provides a wide range of <a href="http://code.jeroenwijering.com/trac/wiki/FlashVars">settings</a> as well as a useful <a href="http://code.jeroenwijering.com/trac/wiki/FlashAPI">JavaScript interface</a>. While the majority of these possibilities are documented in the <a href="http://code.jeroenwijering.com/trac">Wiki</a>, I could not find any helpful information about the plug-ins. Maybe this is due to a lack of time, or perhaps it is because this possibility could be used to emulate the functionality of <a href="">longtail video</a> – a side project of Jeroen Wijering.<br />But this is just speculation, so let us get started! This article will not give you turnkey sources, nor will it be a full tutorial on how this or that could be realized with the JW-Player. I presume that you have already worked with Adobe Flash at a basic level, so you can use my descriptions as a foundation for your own projects.</p>
<h3>How to integrate plug-ins</h3>
<p class="annotation warning left">You will need to have the <a href="http://www.miniml.com/">kroeger 05/63</a> pixel font installed / activated to not break the default player skin layout!</p><p>First you should get the source files and create an <abbr title="HyperText Markup Language">HTML</abbr> file which loads the JW-Player in order to set up a test bench. In the next step we will create an Adobe Flash file to house our plug-in functionality. For now the size and frame rate do not matter; however, I suggest choosing a size around 300x200 pixels – which should be <del>both, somewhat around the default as well as </del><ins>slightly smaller than the default size and </ins>the smallest display the video player should be used with. This way you can lay out your plug-in and perhaps add a script for scalable display later. Whether you use scripts or not, you need to define a function called <q>initialize</q>. Otherwise the player will throw an error!<br />The plug-in <abbr title="ShockWave File"><a href="http://www.adobe.com/devnet/swf/">swf</a></abbr> should be published somewhere relative to your test <abbr title="HyperText Markup Language">HTML</abbr> file, ideally in a layout similar to the later production environment. In the following code snippets I will assume that the plug-in is placed in a subdirectory <q>media/flash/plugins/</q> and is called <q>plug-in.swf.</q><br />We will now pass some values to the JW-Player swf making use of the so-called flashvars. (I am sorry, I really tried to find some information about flashvars on the official sites of Adobe. I could not find anything helpful even in the <a href="http://livedocs.adobe.com/flash/9.0/">livedocs</a>, 
except some outdated samples for <a href="http://kb.adobe.com/selfservice/viewContent.do?externalId=tn_16417">Flash 6</a> and the use with <a href="http://www.adobe.com/livedocs/flash/9.0/main/wwhelp/wwhimpl/js/html/wwhelp.htm?href=00000887.html">ActionScript 2.0</a>, which in this case are at best of basic use for understanding what we are talking about. Perhaps someone can help out here? As there are several common ways to embed Adobe Flash in today's web pages, I will attach a short paragraph containing samples of flashvar use at the end of this article.) Let us give the JW-Player a hint where to look for the plug-in. Add the flashvar <code>plugins</code> with the relative path to our plug-in as value, in our case <code>media/flash/plugins/plug-in.swf</code>. Your <abbr title="HyperText Markup Language">HTML</abbr> code should look somewhat like this:</p><code class="block"><object type="application/x-shockwave-flash"
data="player.swf"
width="300px"
height="200px" >
<param name="allowScriptAccess" value="sameDomain" />
<param name="allowFullScreen" value="true" />
<em><param name="flashvars" value="plugins=media/flash/plugins/plug-in.swf" /></em>
<param name="quality" value="high" />
<p>Please get the Adobe Flash Player!</p>
</object>
</code>
<p class="annotation notice right">Be sure to use the relative path originating at the <abbr title="HyperText Markup Language">HTML</abbr> document – not the <abbr title="ShockWave File"><a href="http://www.adobe.com/devnet/swf/">swf</a></abbr>!</p><p>Yes, it is as easy as that! Now your plug-in should be loaded automatically. But wait! Offline everything works fine, but the Flash player is not able to find the plug-in on the server. After some investigation, we know where the trouble comes from: We have to navigate through the ActionScript package of the JW-Player to the main view class (com → jeroenwijering → player → View.as). Just at the end of the class header there is a private string <q>directory</q> with an <abbr title="Uniform Resource Locator"><a href="http://en.wikipedia.org/wiki/URL">URL</a></abbr> assigned as value: <q>http://plugins.longtailvideo.com/</q>. Perhaps this is a hint at how the plug-in interface is usually meant to be used. If you set the <q>directory</q> value to an empty string (like this: <code>private var directory:String = "";</code>) the whole thing should work online. (A better way to solve this problem would be to <q>take the deep dive into this piece of code and understand what really is happening here</q> before removing some hardcoded value.)<br />From now on the rest is just some ActionScript with which we can do some magic! Since the code of the video player is well structured and nearly every event you could imagine is dispatched, all you have to do is listen to the relevant part (Model, View, Controller) and invoke some action!</p>
<h3>The fun part – ActionScript!</h3>
<p>Here we go: Include the View-class and declare a variable to hold the reference to the players View-instance:</p>
<code class="block"> // includes
import com.jeroenwijering.player.View;
// variables
// Reference to View-instance of video player
var v_player:View;
</code>
<p class="annotation note right">At this point, lazy guys, there is no short version, no <q>init</q>, no programmer's shorthand!</p><p>And again, Jeroen Wijering made life with plug-ins easy. If you set up a function called <q>initialize</q>, it will be called as soon as your plug-in is properly loaded by the player:</p>
<code class="block">…
// Called after loading of plugin is completed
function initialize( par_playerView:View ) {
v_player = View( par_playerView );
trace( "do something useful" );
}
</code>
<p>The simple call to do something does not help much, so here is an example of what could come next. It would be handy if the plug-in content stayed centered when the player is resized! Let us assume we have a movie clip containing a graphic (<code>mc_content</code>, movie clip registration point should be on the top left) which should be horizontally centered and positioned somewhere above the vertical center (since there is this nice button to start the playback). In addition, this graphic should disappear when playback starts. We have to start by adding some lines to our plug-in's code <q>header</q>; furthermore, we will instantiate a transition manager to handle a nice fade-out of our plug-in content:</p>
<code class="block"> // imports
// video player stuff
import com.jeroenwijering.player.View;
import com.jeroenwijering.events.ModelEvent;
import com.jeroenwijering.events.ModelStates;
import com.jeroenwijering.events.ControllerEvent;
// eye candy
import fl.transitions.Fade;
import fl.transitions.Tween;
import fl.transitions.easing.*;
import fl.transitions.Transition;
import fl.transitions.TransitionManager;
// variables
// Reference to View-instance of video player
var v_player:View;
// Transition manager
var trnsmngr_content:TransitionManager = new TransitionManager( mc_content );
// functions
// Configure standard transitions used to fade plugin content in and out
var obj_contentFadeOut:Object = new Object();
obj_contentFadeOut.type = Fade;
obj_contentFadeOut.direction = Transition.OUT;
obj_contentFadeOut.duration = .7;
obj_contentFadeOut.easing = Regular.easeOut;
var obj_contentFadeIn:Object = new Object();
obj_contentFadeIn.type = Fade;
obj_contentFadeIn.direction = Transition.IN;
obj_contentFadeIn.duration = .7;
obj_contentFadeIn.easing = Regular.easeOut;
…
</code>
<p>Now we have to set up some functions to handle the positioning of the content element and to kick off the vanishing of this element on playback.</p>
<code class="block">…
// Invoked on resize
function onPlayerResize( par_arg ) {
alignContent( Number( par_arg.data.width ), Number( par_arg.data.height ) );
}
// Align content movie clip
function alignContent( par_width:Number, par_height:Number ) {
mc_content.x = Number( par_width / 2 - mc_content.width / 2 );
mc_content.y = Number( par_height / 2 - ( mc_content.height + 40 ) );
}
// Invoked on status change of video player
// - if playback has finished: fade in plugin content
// - if playback is (re-)started: hide plugin content
function onPlayerStateChange( par_arg ) {
if( par_arg.data.newstate == "COMPLETED" ) {
trnsmngr_content.startTransition( obj_contentFadeIn );
}else if( par_arg.data.newstate == "PLAYING" && par_arg.data.oldstate == "COMPLETED" ) {
trnsmngr_content.startTransition( obj_contentFadeOut );
}
}
…
</code>
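<p>To make the arithmetic in <code>alignContent</code> concrete, here is the same calculation as a small standalone JavaScript sketch (plain JavaScript, not ActionScript, and not part of the plug-in). The function name <code>computeContentPosition</code> and the sample dimensions are made up for illustration; only the 40 pixel offset is taken from the code above.</p>

```javascript
// Hypothetical helper mirroring alignContent above: returns the top-left
// position for a content box inside a player of the given size.
function computeContentPosition(playerWidth, playerHeight, contentWidth, contentHeight) {
  return {
    // Horizontally centered: half the player width minus half the content width.
    x: playerWidth / 2 - contentWidth / 2,
    // The bottom edge of the content ends up 40 pixels above the vertical center.
    y: playerHeight / 2 - (contentHeight + 40)
  };
}

// Sample values (assumed): a 300x200 player and a 100x50 graphic.
const position = computeContentPosition(300, 200, 100, 50);
console.log(position.x, position.y); // prints "100 10"
```

<p>With these sample numbers the graphic sits at x = 100 and y = 10, so its lower edge (y plus its height of 50) lands at 60, which is 40 pixels above the vertical center at 100.</p>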
<p>Obviously, the next thing to care about: event listeners! The listeners are registered in our <q>initialize</q> function:</p>
<code class="block">…
// Called after loading of plugin is completed
function initialize( par_playerView:View ) {
v_player = View( par_playerView );
v_player.addControllerListener( ControllerEvent.RESIZE, onPlayerResize );
v_player.addModelListener( ModelEvent.STATE, onPlayerStateChange );
alignContent( v_player.config.width, v_player.config.height );
}
</code>
<p>Now everything should work fine. The content will be realigned if the player is ever resized at runtime (not very likely), faded in when playback ends and faded out again when playback is restarted. To align the content initially, one line is added which calls <code>alignContent</code>. For convenience, here is the whole code in one piece:</p>
<code class="block"> // imports
// video player stuff
import com.jeroenwijering.player.View;
import com.jeroenwijering.events.ModelEvent;
import com.jeroenwijering.events.ModelStates;
import com.jeroenwijering.events.ControllerEvent;
// eye candy
import fl.transitions.Fade;
import fl.transitions.Tween;
import fl.transitions.easing.*;
import fl.transitions.Transition;
import fl.transitions.TransitionManager;
// variables
// Reference to View-instance of video player
var v_player:View;
// Transition manager
var trnsmngr_content:TransitionManager = new TransitionManager( mc_content );
// functions
// Configure standard transitions used to fade plugin content in and out
var obj_contentFadeOut:Object = new Object();
obj_contentFadeOut.type = Fade;
obj_contentFadeOut.direction = Transition.OUT;
obj_contentFadeOut.duration = .7;
obj_contentFadeOut.easing = Regular.easeOut;
var obj_contentFadeIn:Object = new Object();
obj_contentFadeIn.type = Fade;
obj_contentFadeIn.direction = Transition.IN;
obj_contentFadeIn.duration = .7;
obj_contentFadeIn.easing = Regular.easeOut;
// Invoked on resize
function onPlayerResize( par_arg ) {
alignContent( Number( par_arg.data.width ), Number( par_arg.data.height ) );
}
// Align content movie clip
function alignContent( par_width:Number, par_height:Number ) {
mc_content.x = Number( par_width / 2 - mc_content.width / 2 );
mc_content.y = Number( par_height / 2 - ( mc_content.height + 40 ) );
}
// Invoked on status change of video player
// - if playback has finished: fade in plugin content
// - if playback is (re-)started: hide plugin content
function onPlayerStateChange( par_arg ) {
if( par_arg.data.newstate == "COMPLETED" ) {
trnsmngr_content.startTransition( obj_contentFadeIn );
}else if( par_arg.data.newstate == "PLAYING" && par_arg.data.oldstate == "COMPLETED" ) {
trnsmngr_content.startTransition( obj_contentFadeOut );
}
}
// Called after loading of plugin is completed
function initialize( par_playerView:View ) {
v_player = View( par_playerView );
v_player.addControllerListener( ControllerEvent.RESIZE, onPlayerResize );
v_player.addModelListener( ModelEvent.STATE, onPlayerStateChange );
alignContent( v_player.config.width, v_player.config.height );
}
</code>
<p>There are some additional things you should know when working with this nice player and its plug-in possibilities. Let me give you a few assorted tricks!</p>
<dl>
<dt><code>v_player.config.[any name]</code></dt>
<dd>If you pass your own flashvars to the JW-Player, you can read the value of a variable via this path with your variable name at the end. It would be easy to display the video's title if you pass the flashvar <q>title</q>: you just need a text field and assign it the value <code>v_player.config.title</code>.</dd>
<dt><code>v_player.config.volume</code>, <code>v_player.config.width</code>, <code>v_player.config.height</code>, <code>v_player.config.fullscreen</code></dt>
<dd>There is a bunch of information in the config object of the View instance. Here you can get the current volume, width and height of the player, as well as the setting that tells whether fullscreen is allowed and implemented or not (nevertheless, you cannot alter the fullscreen state here; see below). To get information about the current display state, watch <code>v_player.skin.stage[ "displayState" ]</code>.<br />If you want to know more, have a look at the <q>defaults</q> object of the Player class (com → jeroenwijering → player → Player.as)!</dd>
<dt><code>v_player.sendEvent( [event], [value] )</code></dt>
<dd>Using the function <q>sendEvent</q> you can take control of various parts of the player. Set the volume to 50 % with the following code: <code>v_player.sendEvent( "VOLUME", 50 )</code>, switch the player into fullscreen mode: <code>v_player.sendEvent( "FULLSCREEN", true )</code>, or seek to the keyframe closest to second two: <code>v_player.sendEvent( "SEEK", 2 )</code>.</dd>
<dt><code>v_player.addViewListener</code>, <code>v_player.addModelListener</code>, <code>v_player.addControllerListener</code></dt>
<dd>As shown in the sample, you can add listeners to every part of the player (Model, View, Controller). For further information about the various events, have a look at the <q>events</q> part of the player class package (com → jeroenwijering → events → *).</dd>
</dl>
<p class="annotation hint center">In the above samples, I assume you are working in the plug-in of this tutorial. Otherwise <code>v_player</code> has to be replaced by your own reference to the View instance of the JW-Player!</p><p>This article is not meant as a full-featured tutorial for a fully working this-and-that-performing plug-in, but rather as an insight into the different parts of plug-in programming for the JW-Player, with some short examples.</p>
<p class="annotation warning center">I am not quite sure about the legal matters concerning the use of JW-Player with altered code. So please check the information on <a href="http://www.jeroenwijering.com/">Jeroen Wijering's homepage</a> or contact him to make sure you are not violating any license terms here!</p>
<h2>Flashvars in today's webpages</h2>
<p>As mentioned before, there are several ways of integrating Flash files into webpages. Today a lot of people use JavaScript to embed Flash movies: there is the famous <a href="http://code.google.com/p/swfobject/">swfobject</a> as well as the standard Adobe Flash way using <a title="Not exactly what I wanted, but for sure interesting further reading for the use of AC_FL_Content." href="http://www.adobe.com/devnet/activecontent/articles/devletter.html">AC_FL_RunContent</a>. One main reason to use JavaScript is the outline rectangle that Internet Explorer draws around embedded Flash movies, which forces you to click on the movie once before you can interact with it. Moreover, the script can check whether the correct version of Adobe Flash Player is installed on the client machine.<br />Next we have the former standard way, using markup with <code>object</code>- and <code>embed</code>-tags to integrate Flash. Finally there is one possibility, first shown in an <a href="http://www.alistapart.com/articles/flashsatay/">A List Apart</a> article (for more information please visit: <a href="http://latrine.dgx.cz/how-to-correctly-insert-a-flash-into-xhtml">http://latrine.dgx.cz/how-to-correctly-insert-a-flash-into-xhtml</a>), to integrate Flash in an XHTML-Strict valid way with markup and without the <code>embed</code>-tag, which is in fact a relic from the Netscape era.<br />For each of these methods I will give you a short code snippet on how to pass Flashvars. None of this is new, but take it as a compact reference for the most popular ways to integrate Flash.</p>
<h3>Flashvars and markup methods for Flash integration</h3>
<p>First the markup methods, in the following example the former standard way:</p>
<code class="block"><object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,0,0"
width="300px"
height="200px" >
<param name="allowScriptAccess" value="sameDomain" />
<param name="allowFullScreen" value="true" />
<param name="movie" value="player.swf" />
<em><param name="flashvars" value="plugins=media/flash/plugins/plug-in.swf&title=A%20shortfilm" /></em>
<embed src="player.swf"
<em>flashvars="plugins=media/flash/plugins/plug-in.swf&title=A%20shortfilm"</em>
quality="high"
width="300px"
height="200px"
allowScriptAccess="sameDomain"
allowFullScreen="true"
type="application/x-shockwave-flash"
pluginspage="http://www.macromedia.com/go/getflashplayer" />
<param name="quality" value="high" />
<p>Please get the Adobe Flash Player!</p>
</object>
</code>
<p>In this example we pass the variables <q>plugins</q> and <q>title</q> to the <abbr title="ShockWave Flash"><a href="http://www.adobe.com/devnet/swf/">swf</a></abbr>. You can append as many variables as you need, as long as you keep the pattern: separate the variables using the ampersand <q>&</q> and assign each value using the equals sign <q>=</q>: <code>&[variable name]=[value]</code>.<br />Now for the XHTML-Strict compatible method. Analogous to the sample at the beginning of this article, passing Flashvars would look like this:</p>
<code class="block"><object type="application/x-shockwave-flash"
data="player.swf"
width="300px"
height="200px">
<param name="allowScriptAccess" value="sameDomain" />
<param name="allowFullScreen" value="true" />
<em><param name="flashvars" value="plugins=media/flash/plugins/plug-in.swf&title=A%20shortfilm" /></em>
<param name="quality" value="high" />
<p>Please get the Adobe Flash Player!</p>
</object>
</code>
<p class="annotation notice right">Variable names must not contain any special characters (as usual in ActionScript), and the values should be <a href="http://www.w3schools.com/TAGS/ref_urlencode.asp"><abbr title="Uniform Resource Locator">URL</abbr>-encoded</a>.</p><p>In the first example you have to declare the variables and values <u>twice</u> to ensure that every browser (using either the <code>object</code>- or the <code>embed</code>-tag) gets the correct values!</p>
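<p>If you would rather not assemble and encode the Flashvars string by hand, the pattern described above can be sketched in a few lines of JavaScript. This is only a minimal illustration: the helper name <code>buildFlashvars</code> is made up, and it uses the standard <code>encodeURIComponent</code> function, which also escapes the slashes in the path as <q>%2F</q>, so the result looks a little different from the hand-written string in the markup examples.</p>

```javascript
// Hypothetical helper: turns a plain object into a flashvars string of the
// form name=value&name=value, URL-encoding every value.
function buildFlashvars(variables) {
  return Object.keys(variables)
    .map(function (name) {
      return name + "=" + encodeURIComponent(variables[name]);
    })
    .join("&");
}

// The two variables used throughout this article.
const flashvars = buildFlashvars({
  plugins: "media/flash/plugins/plug-in.swf",
  title: "A shortfilm"
});

console.log(flashvars);
// prints "plugins=media%2Fflash%2Fplugins%2Fplug-in.swf&title=A%20shortfilm"
```

<p>The main benefit is that a value containing an ampersand or an equals sign can no longer break the pattern, because both characters are encoded before the pairs are joined.</p>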
<h3>The JavaScript way</h3>
<p>Since the JavaScript methods are rather proprietary, you have to find the correct solution for each specific JavaScript Flash integration. So, the following examples feature specific code for the <q>AC_FL_RunContent</q> (first) and the <q>swfobject</q> (second) method:</p>
<code class="block"><script language="javascript">
if (AC_FL_RunContent == 0) {
alert("This page requires \"AC_RunActiveContent.js\".");
} else {
AC_FL_RunContent(
'codebase', 'http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=8,0,0,0',
'width', '300',
'height', '200',
'src', 'player',
'quality', 'high',
'pluginspage', 'http://www.macromedia.com/go/getflashplayer',
'id', 'player',
'name', 'player',
'allowFullScreen', 'true',
'allowScriptAccess','sameDomain',
'movie', 'player',
<em>'FlashVars', 'plugins=media/flash/plugins/plug-in.swf&title=A%20shortfilm'</em>
); //end AC code
}
</script>
<noscript>
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000"
codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=8,0,0,0"
width="300"
height="200"
id="player"
align="middle" >
<param name="allowScriptAccess" value="sameDomain" />
<param name="allowFullScreen" value="true" />
<param name="movie" value="player.swf" />
<em><param name="flashvars" value="plugins=media/flash/plugins/plug-in.swf&title=A%20shortfilm" /></em>
<embed src="player.swf"
<em>flashvars="plugins=media/flash/plugins/plug-in.swf&title=A%20shortfilm"</em>
quality="high"
width="300"
height="200"
name="player"
allowScriptAccess="sameDomain"
allowFullScreen="true"
type="application/x-shockwave-flash"
pluginspage="http://www.macromedia.com/go/getflashplayer" />
<param name="quality" value="high" />
<p>Please get the Adobe Flash Player!</p>
</object>
</noscript>
</code>
<p>Be sure to pass the Flashvars in the JavaScript code as well as in the markup fallback: as a <code>param</code>-tag inside the <code>object</code>-tag and as the attribute <q>flashvars</q> of the <code>embed</code>-tag.<br />Finally, the <q>swfobject</q> method:</p>
<code class="block"><script type="text/javascript" src="swfobject.js"></script>
<div id="playerContainer">
This is a flash video player.
</div>
<script type="text/javascript">
var so = new SWFObject("player.swf", "player", "300", "200", "9");
<em>so.addVariable("plugins", "media/flash/plugins/plug-in.swf");
so.addVariable("title", "A shortfilm");</em>
so.write("playerContainer");
</script>
</code>
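<p>Looking back at the markup-based variants, the same Flashvars string has to appear both in the <code>param</code>-tag and in the <code>embed</code>-tag. One way to keep the copies from drifting apart is to generate both fragments from a single string. The following JavaScript is only a sketch under that assumption; the helper names <code>escapeAttribute</code> and <code>buildFallbackFragments</code> are invented for this example. It also escapes the ampersand as <code>&amp;amp;</code>, which is how an ampersand has to be written inside an attribute value in valid XHTML.</p>

```javascript
// Hypothetical helper: escapes a string for use inside a double-quoted
// (X)HTML attribute value.
function escapeAttribute(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: builds the param-tag and the embed attribute
// from one and the same flashvars string.
function buildFallbackFragments(flashvars) {
  const escaped = escapeAttribute(flashvars);
  return {
    param: '<param name="flashvars" value="' + escaped + '" />',
    embedAttribute: 'flashvars="' + escaped + '"'
  };
}

const fragments = buildFallbackFragments(
  "plugins=media/flash/plugins/plug-in.swf&title=A%20shortfilm"
);

console.log(fragments.param);
console.log(fragments.embedAttribute);
```

<p>Both fragments now carry exactly the same value, so a change to the Flashvars only has to be made in one place.</p>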
<h2>Sample sources</h2>
<p>The Flash sources are available here:<br /><a href="/files/daniel/sources/jwplayer-plugin.zip">ZIP file containing plugin.fla</a><br />Set up an <abbr title="HyperText Markup Language">HTML</abbr> document with the flashvars <q>plugins</q> and <q>title</q>:</p>
<code class="block">plugins=[path to plug-in.swf without extension]&title=[video title]</code>
<p>The outcome should be a headline showing the video title on the JW-Player <q>splash screen</q>.</p>
<h2>Still Cracks in Our Foundation</h2>
<p>2008-09-09T14:57:48Z, Michael Kurze (<a href="http://www.thefoundation.de/about/michael">http://www.thefoundation.de/about/michael</a>)</p>
<p>Welcome to thefoundation.de, <a title="about us" href="/about/">our</a> aggregate blog on media and internet technology. There are still many <q>cracks</q> to fill...</p><p>
...but in accordance with the recent <a href="http://www.djangoproject.com/weblog/2008/sep/03/1/" title="Django Weblog: Django 1.0 released!">release</a> of the Django web development framework, we decided to push our site out of the door as well. This blog is very much a spare-time effort and all of us have enough other things on our hands, so even Web 2.0 baseline commodities such as comments, trackback and pingback are still pending.
</p>
<p>
And although I generally agree with the <a href="http://seeknuance.com/2008/02/04/django-blogs-vs-wordpressorg-vs-wordpresscom/" title="Django blogs vs. WordPress.org. vs. WordPress.com">notion</a> that one should not always reinvent the wheel, we chose to develop this blogging application ourselves. The main reason is that it is an easy way to get started with Django and thus with one of the more recent <abbr title="Model, View, Controller">MVC</abbr> frameworks. Another is that we wanted to be able to define relations and interactions among our four journals without limitations. And of course, there are some really nice Django applications that are more easily integrated with your own glue. There is also a more conceptual bonus which might be regarded as a disadvantage by some: <em>Every design decision that has to be made, we have to make.</em> There are lots of <em>defaults</em> in modern publishing applications, leading to a confluence in style and functionality where there could be diversity.
</p>
<p>
But for that to happen here, we still have some catching up to do. See you then!
</p>