Notes on EuroPython 2015

I've just been to my first EuroPython, the biggest Python conference in Europe. This year's edition took place in the wonderful city of Bilbao. As I did for Craft Conf 2015, I'll share some thoughts about the talks I attended and the things I learned.

You can find the full schedule here. The talks have also been recorded, so I expect them to show up on YouTube pretty soon (if they haven't shown up already). I'll only list the talks' titles - you can find the names of the speakers in the schedule.

Keynote: It's Dangerous To Go Alone, Take This: The Power of a Community

This was the opening talk of the conference. It was a beautifully illustrated narrative about how programming communities are not diverse enough and can seem (or be) daunting to newcomers. This led to the introduction of Django Girls, as the talk was given by the two co-founders of the project. There were also some numbers on Django Girls since its inception and, suffice it to say, it's amazing what the community has achieved - tons of workshops, mentors and a great tutorial. They've now also started working on a book, called Yay Python.
I highly recommend the Django Girls tutorial to any non-technical person who wants to start learning about programming.

Asyncio Stack & React.js or Development on the Edge

I have to admit I was expecting this talk to be full of cool demos and whatnot. Unfortunately, it was merely a quick list of ES6 features and a concise intro to asyncio.
The substance of the talk was, I think, the speaker's claim that ES6 and React have finally made the frontend work appealing to the average Django developer.

The Lightweight Cloud Servers War Begins

Another overview kind of talk. Some bullets about containers and container-targeted operating systems (CoreOS, Snappy, RancherOS). I found out about systemd socket-activated containers, a really cool feature. A few words on clustering, Kubernetes and the fact that this is still an emerging model where things change fast.

Everyone can do Data Science in Python

The talk was given by someone at import.io, a pretty cool web-scraping based data acquisition service (whoops, so many words to explain what they do). It was basically a live demo, using IPython, of how to work with data in Python.
A few pro tips as well - be aware of where and how you got the data. Anything that goes wrong in the data acquisition process will affect all the operations you later perform and all the insights you extract. Also, most of the time is spent cleaning data and performing exploratory data analysis. Actually running the ML algorithm(s) is just a tiny bit.

If there is no action as a consequence of your analysis or predictions, you've failed!

The Python Compiler

Nuitka is an alternative Python compiler that prides itself on being highly compatible. It works on all major platforms and it is already faster than CPython, which is pretty amazing considering the project seems to be pretty much a one-man job. The main takeaway was that the project's author takes his time - he never rushes and chooses to do the right thing (even if it might take longer).
From an architectural point of view, the tool parses Python code, performs some transformations, emits C++ code and then compiles it. The author started with C++11, but ended up with C-style C++ code and he intends to eventually get to C99.
Pretty cool!

Metrics-driven development

This was about how a team at Spotify started using metrics to improve their work and workflow.

If it moves, you can track it.

Some cool numbers:

[Image: Spotify numbers]

Python & Internet of Things

Learned about an iPhone case that has some health tracking tricks. Otherwise, just some code snippets showing how to use Python to access embedded hardware.

Python Multithreading and Multiprocessing: Concurrency and Parallelism

Full house, the room was packed. Unfortunately, it was just a run through the stdlib tools such as condition variables, semaphores and so on. Just bullets describing what they do - CS 101.
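For anyone who hasn't met these tools, here is a minimal sketch of one of them, a semaphore, used to cap how many threads do work at the same time. This is not from the talk; the function name `run_workers` and the numbers are made up for illustration.

```python
import threading
import time


def run_workers(n_workers, max_parallel):
    """Start n_workers threads, letting at most max_parallel work at once.

    Returns the highest number of workers seen inside the guarded section.
    """
    slots = threading.Semaphore(max_parallel)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def worker():
        with slots:  # blocks while max_parallel workers are already inside
            with lock:  # the lock protects the shared counters
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # pretend to do some work
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    print(run_workers(6, 2))
```

The semaphore is the only thing limiting concurrency here; the lock just keeps the bookkeeping consistent.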

Lightning talks

I learned about mutation testing from the author of Cosmic Ray. It's the next level of testing: testing your tests! The tool parses your code, randomly mutates parts of the AST (negates conditionals, removes branches etc.) and checks that your tests catch these changes (i.e. they fail). So meta!

Instructions is a library for finding stuff in iterables without having to re-write those list comprehensions over and over again.

Someone showed how you can use gdb to learn about Python's internals. E.g. set a breakpoint in listappend and then, from the Python interpreter, call [].append(5).

Another talk shared some protips on processing strings. Rule number 0 was use an existing library. Rule number 1 was if in doubt, use regexps. Reminds me of this.

Pygame Zero is a great way for beginners to make games using Python.

Keynote: Python now and in the future

This was Guido's keynote. Excitement++. He'd like you to switch to Py3. It's not a demand. It's all about opportunity cost. Overall, Py3 is a better language; also better for teaching. The type hints PEP was hard to get accepted. He made a joke that being the BDFL, he could've accepted it himself.

Guido took questions from the audience:

  • favorite features of Python 3: async def, async for, type hints, matrix multiplication operator (@)
  • on Python having many open bugs: it's an open source project and volunteers don't necessarily want to work on the "ugly stuff"; there are also plenty of bugs that are hard to reproduce
  • what he hates about Python: packaging and distribution
  • favorite exception: KeyboardInterrupt
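For reference, a small sketch of two of the features he mentioned, `async def` and the `@` operator. The `Matrix` class and the `fetch` coroutine are made-up illustrations, and `asyncio.run` only arrived in Python 3.7 (in 3.5 you would go through the event loop's `run_until_complete`).

```python
import asyncio


class Matrix:
    """Minimal matrix wrapper, only here to show the @ operator."""

    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):
        # Called for `self @ other`: plain row-by-column multiplication.
        cols = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in self.rows]
        )


async def fetch(value, delay):
    """Stand-in for an I/O-bound call; asyncio.sleep yields to the event loop."""
    await asyncio.sleep(delay)
    return value


async def main():
    # Both coroutines wait concurrently; gather returns results in argument order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
product = Matrix([[1, 2], [3, 4]]) @ Matrix([[0, 1], [1, 0]])

if __name__ == "__main__":
    print(results)
    print(product.rows)
```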

I enjoyed his keynote, it was pretty down to earth and gave a fair status of the Python landscape.

PySpark and Warcraft Data

Spark demo on some World of Warcraft auction house data. Really cool, PySpark seems really easy to use. The speaker made the argument that Spark is considerably cheaper than Hadoop, considering you can store your data in S3 and turn on your cluster only while you actually need it.

[Image: WoW data]

[Image: queries]

Building mobile APIs with services at Yelp

This was the first of the two Yelp talks. Mainly focused on the Yelp deployment process in the context of services and mobile APIs.

[Image: Yelp]

They test their service-oriented architecture using docker-compose. This approach is nice because they test against all services running in a production-like environment, but the process is pretty heavyweight and it grows with the number of called services.

Important point about mobile APIs - it is not web development. Some users never upgrade their apps, so the APIs always have to be backwards compatible.

Image recognition and camera positioning with OpenCV. A tourist guide application.

I was hoping for some cool demos. Only found plenty of math:

[Image: math]

Keynote: Towards a more effective, decentralized web

[Image: shame]

Some words on the state of the Web today: a big centralized system with a few main nodes (Google, Facebook etc.). The speaker made the point that such a system wouldn't scale once we get to Mars (Elon Musk + SpaceX references here).
He also went through the building blocks of a potential distributed web - BitTorrent, git, DHTs. Finally, he introduced IPFS as a potential solution for achieving a more decentralized Web. IPFS is really interesting, I suggest you take a look!

Data Analysis and Map-Reduce with mongoDB and pymongo

This was similar to the WoW PySpark talk mentioned above, but it used Mongo's aggregation framework on iTunes data.

[Image: mongo]

  • the aggregation framework seemed really easy to use from Python
  • WiredTiger's compression is impressive
  • the speaker learned a lot about Taylor Swift and Katy Perry

Type Hints for Python 3.5

This was not a keynote but it sure felt like one. Full house in the biggest room of the venue. Guido went through the history of the type hints proposal (starting from 2000, FYI). It seems everything started to crystallize as soon as mypy came about.
The idea is to allow Python code to be annotated and then devs can run mypy to check for type errors. As expected with gradual typing, parts of the code that have no type annotations are considered OK. There are no plans for optimizing the generated code based on the type hints - it is only about safety. Good news - tools such as PyCharm will switch to using the standard PEP type hints instead of their own.
There are some "hacks" in place - for example Any, which is both a super- and a subclass of everything.

[Image: Any]

Here's how one would annotate a generic function:

[Image: generics]

I'd argue that the snippet above would make both a Haskell and a Python programmer cry, but hey, it's the best they could do while being backward compatible.
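The slide isn't reproduced here, so this is only a rough sketch of what generic annotations look like with the `typing` module, not the code Guido showed. The function names `first`, `describe` and `untyped` are made up for illustration.

```python
from typing import Any, Sequence, TypeVar

T = TypeVar("T")  # a type variable: "some type, the same one everywhere it appears"


def first(items: Sequence[T]) -> T:
    """Generic function: the return type follows the element type of the argument."""
    return items[0]


def describe(value: Any) -> str:
    """Any is compatible with every type, so a checker accepts any argument here."""
    return type(value).__name__


def untyped(x):
    # No annotations: with gradual typing, a checker treats this as dynamically typed.
    return x


if __name__ == "__main__":
    print(first([1, 2, 3]))
    print(describe("hello"))
```

Nothing is enforced at runtime; the annotations only matter once you run a checker such as mypy over the file.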

Designing a scalable and distributed application

Great demo showing Consul - used both for service discovery and as a distributed datastore, across two datacenters. Slides here.

Come to the Dark Side! We have a whole bunch of Cookiecutters!

Cookiecutter helps start projects from templates. There was a demo of creating a Kivy app. Seemed easy enough, and the UI looked native (on OS X).

NumPy: vectorize your brain

Quick intro to NumPy. The speaker is a developer at JetBrains and also a teacher. She showed some of the common mistakes her students make when implementing algorithms in pure Python instead of using NumPy. She also demoed some advanced IPython magic (e.g. compiling and running C code within IPython).

gitfs - building a filesystem in Python

gitfs is a nice tool built by a Romanian startup. They host WordPress sites and needed a way to make versioning usable for non-technical people (content authors). What they ended up doing is using git through FUSE to provide a magical filesystem where everything is versioned and non-technical people can go back in time by simply exploring the folder tree. Really interesting!

What it's really like building RESTful APIs with Django

The folks at Lyst used to have a single RPC endpoint for their API. This made it hard to develop against and hard to cache.

RPC for HTTP is bad.

Great insight: design APIs by writing documentation first. Resources are not necessarily the same as models.

They applied these tips when rewriting their API with Django REST Framework. To achieve the model-resource separation, they turned to serializers.

Python's Infamous GIL

Larry Hastings talked about the history of the global interpreter lock in Python: how it came to be and why it can't be removed. He explained things in a very beginner-friendly way (a simple example of how refcounting without the GIL could go bananas).
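This is not his example, but the refcount problem can be sketched deterministically in a few lines. The toy below splits "increment the refcount" into a read step and a write step and interleaves two of them by hand; `Obj`, `incref_steps` and the two `run_*` helpers are hypothetical names, and real CPython refcounting happens in C, not like this.

```python
class Obj:
    """Toy object with a manually managed reference count."""

    def __init__(self):
        self.refcount = 1


def incref_steps(obj):
    """A non-atomic increment, split into two steps so we can interleave it."""
    value = obj.refcount  # step 1: read the current count
    yield
    obj.refcount = value + 1  # step 2: write back the incremented count
    yield


def run_interleaved(obj):
    """Two 'threads' both read before either writes: one increment gets lost."""
    a, b = incref_steps(obj), incref_steps(obj)
    next(a)  # A reads 1
    next(b)  # B reads 1
    next(a)  # A writes 2
    next(b)  # B writes 2, although two references were added
    return obj.refcount


def run_serialized(obj):
    """With a lock around the increment, each one finishes before the next starts."""
    for _ in range(2):
        for _ in incref_steps(obj):
            pass
    return obj.refcount


if __name__ == "__main__":
    print("interleaved:", run_interleaved(Obj()))
    print("serialized:", run_serialized(Obj()))
```

A count that is too low means an object can be freed while something still points to it, which is the kind of bug the GIL rules out.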

Keynote: So, I have all these Docker containers, now what?

This keynote was given by a Google engineer. The first part was about Borg, which is some crazy smart technology for managing clusters and executing code. The second part was about Kubernetes, a lighter, open source version of Borg aimed at Linux containers.

The general idea is that if you pack all your processes in containers, you can then easily and efficiently scale by using a tool like Kubernetes. Because everything is containerized, resource usage can be monitored and containers can be distributed on different machines in a way that makes sense (e.g. place a CPU intensive, low memory container near a low CPU, memory intensive container).
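To illustrate the placement idea, here is a toy first-fit placement in Python. It is only a sketch of the general principle and has nothing to do with how the Kubernetes scheduler actually works; the node names, container names and resource numbers are invented.

```python
def place(containers, nodes):
    """First-fit placement: put each container on the first node with room for it."""
    free = {name: dict(capacity) for name, capacity in nodes.items()}
    assignment = {}
    for cname, need in containers.items():
        for nname, avail in free.items():
            if need["cpu"] <= avail["cpu"] and need["mem"] <= avail["mem"]:
                avail["cpu"] -= need["cpu"]
                avail["mem"] -= need["mem"]
                assignment[cname] = nname
                break
        else:
            assignment[cname] = None  # nothing fits on any node
    return assignment


nodes = {"node-1": {"cpu": 4, "mem": 8}, "node-2": {"cpu": 4, "mem": 8}}
containers = {
    "encoder": {"cpu": 3, "mem": 1},  # CPU intensive, low memory
    "cache": {"cpu": 1, "mem": 6},  # low CPU, memory intensive
    "encoder-2": {"cpu": 3, "mem": 1},
}
assignment = place(containers, nodes)

if __name__ == "__main__":
    print(assignment)
```

The CPU-heavy and the memory-heavy container end up sharing a node because their demands complement each other, while the second CPU-heavy one has to go elsewhere.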

Arrested Development - surviving the awkward adolescence of a microservices-based application

This was the 2nd Yelp talk, on the issues they have had during and since migrating to microservices. Most people only talk about the bright side, but the Yelp guys were honest enough to talk about the problems as well.

  • API complexity increased
  • coupling came back (services had to know how to call other services)
  • interactions between services are harder to keep track of (different processes, different machines etc.)

Protips: use something like Swagger - writing a thorough Swagger spec is lots of work but it sure is worth it. You get automatically generated client SDKs and clear documentation. Writing the spec makes developers think about their API, so everything becomes intentional and explicit. Another tip - use something like the ELK stack. Going through logs when you have tens of services on tens of machines is not easy. Switching to ELK made it easier for them to become aware of issues and debug them. They also use ElastAlert for alerts on top of Elasticsearch.

How you can benefit from type hints

This talk was from another JetBrains developer, on how the new type hints can benefit tooling. Learned about stub files. TL;DR autocompletion and code insights in PyCharm will be even better once type hints become widely used.

Preparing Apps for Dynamic Scaling

This was mostly a sales pitch talk for a service that offers dynamic vertical scaling. The main point was that horizontal scaling takes work because you need distributed data (e.g. sessions for users). Slides.

Tuning Python applications can dramatically increase performance

The talk was given by an Intel engineer and it started with a wall-of-text legal disclaimer. #corporateWorldFTW

There was a comparison of different kinds of profilers (cProfile, line-based and sampling). Then, the speaker showed a demo of a working prototype for a sampling Python profiler in VTune.

The demo profiled two functions that created strings - one kept concatenating, the other one used a mutable buffer. What was interesting was that under the line profiler, the slow code (concatenating) ended up being quicker than the faster code (mutable buffer). Turns out overhead is really important when profiling.
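The demo code wasn't shared in these notes, so here is a guess at what the two functions might have looked like, timed with the stdlib `timeit` instead of a profiler. `build_concat` and `build_buffer` are made-up names, `io.StringIO` is just one possible "mutable buffer", and which version wins depends on the interpreter and the input size.

```python
import io
import timeit


def build_concat(n):
    """Build a string by repeated concatenation."""
    s = ""
    for i in range(n):
        s += str(i)
    return s


def build_buffer(n):
    """Build the same string by writing into a mutable in-memory buffer."""
    buf = io.StringIO()
    for i in range(n):
        buf.write(str(i))
    return buf.getvalue()


if __name__ == "__main__":
    for func in (build_concat, build_buffer):
        # timeit adds very little per-call overhead, unlike a line profiler,
        # which instruments every executed line.
        seconds = timeit.timeit(lambda: func(10000), number=20)
        print(func.__name__, round(seconds, 4))
```

Running the same two functions under a line-based profiler and comparing against these timings is a quick way to see the overhead effect for yourself.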

PyPy and the future of the Python ecosystem

PyPy is an alternative Python implementation that is really fast, thanks to its JIT. The speaker has been a contributor for 4 years.

Dynamic languages are getting faster and faster (alternative) implementations. A bit of trolling:

If PHP can do it, so can we!

Again, an explanation of why CPython can't be changed. It's all about the C extensions, which have low-level access to the interpreter's memory management. And C extensions are really important to the Python community.

To kind of have the best of both worlds, the speaker developed pymetabiosis, a bridge which allows importing CPython modules within PyPy. Really meta, and it works! There was a demo of plotting stuff using matplotlib from PyPy.

Static type-checking is dead, long live static type-checking in Python!

In case you couldn't tell by now, type hints and friends were a big subject at this year's EuroPython. This talk showed a bit of how static type checking works and how tools such as QuickCheck can be used to make Python code safer.
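For a feel of the QuickCheck style, here is a hand-rolled sketch using only `random`: generate lots of inputs, check a property on each and report the first counterexample. It is not the tool from the talk; `check_property`, the generator and the two properties are hypothetical names for this example.

```python
import random


def check_property(prop, gen, runs=200, seed=0):
    """Run prop on randomly generated inputs; return a counterexample or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        value = gen(rng)
        if not prop(value):
            return value
    return None


def random_int_list(rng):
    """Generator: a list of up to 10 small integers."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, 10))]


def sort_is_idempotent(xs):
    """Property that should hold: sorting twice equals sorting once."""
    return sorted(sorted(xs)) == sorted(xs)


def reverse_is_identity(xs):
    """Deliberately wrong property: reversing a list gives the same list."""
    return list(reversed(xs)) == xs


if __name__ == "__main__":
    print(check_property(sort_is_idempotent, random_int_list))
    print(check_property(reverse_is_identity, random_int_list))
```

Real property-based testing libraries add smarter generators and shrinking of counterexamples on top of this basic loop.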

Found out about Quark, a web browser with a formally verified kernel.

Lightning talks

SwampDragon-based demo similar to Twitch Plays Pokemon. Left side of the room vs right side of the room.

Conda 101.

Bcolz - data storage faster than memory. Turns out on certain CPUs, compressing data makes things faster, because decompression is quicker than transferring more data. Who would've thought?!

Deepish learning - using computer vision and deep learning to read emotions from pictures and then use those emotions in Python conditionals. E.g. if sad(): stuff.

Making super safe: cooper, a Python library.

Ending talk - some numbers: 1094 attendees. 170 talks (tons of tracks).

Bonus: Google SRE Classroom

This was an optional workshop that you had to apply for. We split into teams of about 7 people and each team had a "mentor" from the Google SRE team. We had to solve a system design problem - given some infrastructure and a set of constraints, propose a solution for a given problem. There was no coding involved, just back-of-the-envelope calculations and architecture ideas. It was definitely challenging and pretty fun!

Summing up

There's a lot of stuff happening around Python 3. Asyncio and type hints are the main things.

Python is a great language for education and for getting people into programming. The Django Girls tutorial is really awesome.

C extensions make it such that CPython will probably never get over the GIL and its inferior threading model. A lot of great work is being done around alternative implementations such as Nuitka and PyPy, but until C extensions are ported, they will only be targeting a niche market.

That being said, Python is in no way "too slow". It is simply a matter of using the right tool for the right job. For example, doing IO is just fine. If you need to crunch some numbers, turn to something like NumPy and you'll be fine.


That is all, sorry for the MVP-ish structure of the text. If you are curious to get more details about any of the above, feel free to ping me on Twitter.

Mihnea Dobrescu-Balaur

Building @Hootsuite Analytics. Interested in Open Source, programming languages and the Open Web. Big fan of basketball and photography. Tweet @mihneadb.

Bucharest, Romania
