Farm Development

PyPi (Cheeseshop) on Google App Engine

Like many of you, I've had my jaw on the floor since the release of Google App Engine. Although there are skeptics out there, a careful read of their terms will show you that it's for real — Google has released GOOGLE to the world and it's not for scary marketing purposes. In fact, I've been growing tired of paranoid Google haters; I'm hoping this will shut them up for a while.

Why is App Engine such a breakthrough? The concept of a hosted web application is nothing new but it has never been done this well. Mundane server maintenance? Gone. Infinite scalability? Check. 100% uptime? Let's face it, if Google went offline you'd probably be down in a nuclear bunker playing Parcheesi.

So ... how should we leverage this tool for the greater good of the community? I can't count the ways without getting dizzy. How about we start with a mirror of PyPi, the Python Package Index?

PyPi on the App Engine

I barely spent two days on it, but here it is: http://pypi.appspot.com/. Test it out, play with it, try to break it.

As Python grows, especially due to App Engine, PyPi needs to scale too. Zope has put together a PyPi mirror but that's the only other one I know of (actually, I can't even find the link to it right now). Coincidentally, PyPi even went offline for a few minutes while I was writing this blog post.

Issues...

You Can Help

I'm not dedicated to this project; I just thought it sounded like a good idea and would be a fun way to experiment with the App Engine. If anyone is interested in working on it, just let me know: kumar.mcmillan@gmail.com. If there is enough interest I'll put it on Google Code.

Possibly the most exciting feature of App Engine is the Datastore API (aka BigTable) and Ben Bangert agrees. It's a little hard for me to wrap my head around it, but so far the Expando class (besides being the coolest name for a class) seems to work great for storing package data. If EGG-INFO grows a new parameter, it just gets tacked on to the row dynamically.
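To make the "tacked on dynamically" idea concrete, here is a minimal sketch in plain Python. It does not use the real App Engine API (there the base class would be `google.appengine.ext.db.Expando`); `PackageRecord` and `record_from_pkg_info` are hypothetical names for illustration, and the lowercase-with-underscores field naming is my assumption, not necessarily what the mirror does. It assumes the metadata arrives as PKG-INFO text, which uses RFC 822 style headers.

```python
from email.parser import Parser


class PackageRecord(object):
    """Hypothetical stand-in for an Expando-style entity.

    Any keyword passed in becomes an attribute, so no schema is declared
    up front.
    """

    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)


def record_from_pkg_info(text):
    """Build a PackageRecord from PKG-INFO text (RFC 822 style headers)."""
    headers = Parser().parsestr(text)
    fields = {}
    for key in set(headers.keys()):
        values = headers.get_all(key)
        # Assumed naming convention: "Home-page" becomes "home_page".
        name = key.lower().replace("-", "_")
        # Repeated headers (e.g. Classifier) are kept as a list.
        fields[name] = values[0] if len(values) == 1 else values
    return PackageRecord(**fields)


if __name__ == "__main__":
    old_style = "Metadata-Version: 1.0\nName: fixture\nVersion: 1.0\n"
    new_style = old_style + "Requires-Python: >=2.5\n"
    print(sorted(vars(record_from_pkg_info(old_style))))
    print(sorted(vars(record_from_pkg_info(new_style))))
```

A record built from the longer metadata simply carries one more attribute; nothing about the class has to change when a new field appears.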

This has also been a great way to dig up bugs, some of which have already been fixed.

  • Re: PyPi (Cheeseshop) on Google App Engine

    While GAE might be cool, it is also a completely new platform, to which applications must be ported. Also, I like to have control over the environment where I run my apps.

    As for PyPI, yes we need mirrors, but sometimes the centralized approach of PyPI isn't the right solution for all needs. For example, if your app relies on many third-party packages, you have no control over their PyPI pages and the packages and download links they make available. This means that your installation with easy_install can break at any time when some of your dependencies get updated. That's why sometimes you need to maintain your own package index. I have written a small PyPI server with the TurboGears framework, called EggBasket (http://chrisarndt.de/projects/EggBasket). Check it out! (It doesn't run on GAE, though ;-)

  • Re: PyPi (Cheeseshop) on Google App Engine

    Christopher, many thanks for the EggBasket! I owe a lot to your code since it provided me with a good starting point. At first I thought I could write the appengine version on top of yours, but I ran into many snags where EggBasket wanted to work with the file system. That is, for all the file handling I have to use StringIO buffers, and this required fiddly changes everywhere. Also, yolk has given me some great ideas, thanks for that too ;)

    I think EggBasket solves the problem of hosting a private egg index very nicely. What I'm doing is in no way a replacement for that. For example, my company uses eggs for all Python deployment and most of these are not open source (trust me, you don't want them). For this we use a private repository that easy_install can access while we are on our intranet. In fact, after seeing EggBasket the other day I plan to migrate our home-grown server to EggBasket since ours doesn't support the upload command.

    I see this PyPi mirror more as a way to achieve better redundancy for publicly available eggs.

  • Re: PyPi (Cheeseshop) on Google App Engine

    100% uptime? Don't count on it. I have a few Google Groups going, which is an application with far more operational constraints than GAE, and those certainly go down every few months; usually for half a day or so the group will be "temporarily unavailable". When they go down, there is *nobody* to complain to, either... you just have to wait it out and hope they realize something is broken.

    So GAE may be paradigm shifting and all that, but for more serious host consumers like me I can't see what Google could do to make GAE more appealing than a straight VPS where I can run whatever software I want without restriction, still have no hardware issues, and get guaranteed service and portability.

  • Re: PyPi (Cheeseshop) on Google App Engine

    ok, ok, 98% uptime ;)

    For me, App Engine offers invisible hardware and a guaranteed set of dependencies. I like that. They have HUGE incentive to make it "just work". If it just works then I don't have to spend time upgrading the OS on my VPS, updating other libs for bugfixes, troubleshooting why the VPS is suddenly slow, or whatever.

    But, of course, we all know that "just works" is always a lie so it has yet to be determined how reliable App Engine will be. Only time will tell.

  • Re: PyPi (Cheeseshop) on Google App Engine

    Kumar, glad that you like EggBasket. BTW, I released version 0.4a a few hours ago (Changelog on the website). Yes, EggBasket relies very much on the file system because I wanted to avoid maintaining the packages in a database. You should be able to just point it at a directory with packages and be good to go. And add packages by just copying them to the right places (for example with scp). At the moment each package needs its own subdirectory, but I might implement scanning for packages in the top-level directory as well (but then caching would have to be implemented first).

    For intranets and closed-source packages, I should implement options to protect package listings and downloads with a login, but in TurboGears it is not straightforward to restrict access to a group and at the same time make it an option to allow access to anonymous users as well. It's not a feature I need right now, but probably will soon. So stay tuned...

  • Re: PyPi (Cheeseshop) on Google App Engine

    For large downloads, you could use Range requests to fetch pieces, with a maximum size (I think if you give an over-expansive range, the server should just ignore it -- so you could send your first request with the maximum size range you can handle, and get back whatever you get).

    And hey, WebOb has Range support! You'd just need to make a WSGI proxy for urlfetch. I bet such a proxy would be really easy to write.
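
The Range-request idea from the last comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch` is a hypothetical callable that takes a URL and a headers dict and returns `(status, body)`, standing in for whatever wraps urlfetch, and `make_fake_fetch` is an in-memory server stand-in so the sketch runs without a network. The sketch is written for Python 3.

```python
def fetch_in_chunks(fetch, url, chunk_size):
    """Download url piece by piece using HTTP Range requests.

    fetch is a hypothetical callable: fetch(url, headers) -> (status, body).
    """
    parts = []
    offset = 0
    while True:
        last_byte = offset + chunk_size - 1
        headers = {"Range": "bytes=%d-%d" % (offset, last_byte)}
        status, body = fetch(url, headers)
        if status == 200:
            # The server ignored the Range header and sent the whole file.
            return body
        if status == 416:
            # The requested range starts past the end of the file: done.
            break
        if status != 206:
            raise IOError("unexpected HTTP status %d" % status)
        parts.append(body)
        offset += len(body)
        if len(body) < chunk_size:
            # A short piece means we just received the tail of the file.
            break
    return b"".join(parts)


def make_fake_fetch(payload):
    """Return an in-memory stand-in for a server that honors Range."""
    def fetch(url, headers):
        spec = headers["Range"][len("bytes="):]
        start, end = (int(part) for part in spec.split("-"))
        if start >= len(payload):
            return 416, b""
        return 206, payload[start:end + 1]
    return fetch


if __name__ == "__main__":
    data = bytes(range(256)) * 10
    fake = make_fake_fetch(data)
    result = fetch_in_chunks(fake, "http://example.invalid/pkg.egg", 1000)
    print(len(result), result == data)
```

Passing the fetcher in as an argument keeps the chunking loop independent of urlfetch, so the same loop could sit behind a WSGI proxy or be exercised offline.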

