My notes from PyCon 2014.

John Perry Barlow

Go back – couple of fun quotes

All Your Ducks In A Row: Data Structures, Brandon Rhodes

Further reading:

  • What your computer does while you wait
  • Misaligning Data Can Increase Performance 12x, Dan Luu
  • but the Python standard library doesn’t pay much attention to memory hierarchies

Notes:

  • int is 24 bytes for 8 bytes of data: ref count, address of type, then the data
  • struct module, array module
  • Python’s special arrays: str, unicode, memoryview, etc
  • NumPy array for math, Python array for I/O/talking to C libraries
  • NumPy and Blaze
  • “All problems in CS can be solved by another level of indirection.” – David Wheeler
  • indirection: “Python: the language that gives you data structures w/o the actual data.”
  • tuples can do bounds checks! ;)
  • a thousand vs. a million; for a thousand items (3 0’s): O(n) is 3 0’s, O(n ln n) is 4 0’s, O(n**2) is 6 0’s
  • speed vs. space tradeoff, lists 94% populated on average, 6% waste
  • The Mighty Dictionary, PyCon 2010, Brandon Rhodes
  • list is about order, dict is about association – OrderedDict does both
  • __slots__ makes a class that uses a struct rather than a dict
  • bisect module – binary search
  • deque: a list that can be modified at both ends
  • heapq and PriorityQueue
  • generally we’ve outsourced trees to the data persistence layer
  • “The Clean Architecture in Python”, Brandon Rhodes, PyCon 2013
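
Several of the standard-library tools in the notes above can be tried in a few lines. This is only a sketch: the sample values and the `Point` class are made up for illustration, and object sizes depend on the Python version and platform, so they are printed rather than assumed.

```python
import bisect
import heapq
import sys
from array import array
from collections import deque


class Point:
    """Hypothetical class; __slots__ replaces the per-instance __dict__."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


# A Python int carries object overhead (ref count, type pointer) around its data.
print("int object size:", sys.getsizeof(1), "bytes")

# array stores raw machine values contiguously instead of pointers to objects.
numbers = array("d", [1.0, 2.0, 3.0])
print("bytes per array item:", numbers.itemsize)

# bisect: binary search that keeps a sorted list sorted.
scores = [10, 20, 40]
bisect.insort(scores, 30)
position = bisect.bisect_left(scores, 40)

# deque: cheap appends and pops at both ends.
queue = deque([2, 3])
queue.appendleft(1)
queue.append(4)

# heapq: a priority queue on top of a plain list; the smallest item pops first.
heap = []
for priority in (5, 1, 3):
    heapq.heappush(heap, priority)
smallest = heapq.heappop(heap)

# Instances of a __slots__ class have no __dict__ at all.
point = Point(1, 2)
has_dict = hasattr(point, "__dict__")
```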

Cache me if you can

  • review on video

Decorators: A Powerful Weapon in your Python Arsenal, Colton Myers, @basepi

  • SaltStack
  • decorators wrap functions
    • add functionality
    • modify behaviour
    • setup/teardown
    • diagnostics
  • functions in Python are objects, first class functions
  • functions can create other functions
  • decorators are (usually) closures
  • @decorator is syntactic sugar:

      @decorate
      def myfunc():
          pass
    
      myfunc = decorate(myfunc)
    
  • Good decorators are versatile, use *args and **kwargs
  • after decorating, myfunc.__name__ -> 'inner' – copy __name__ and __doc__ onto the wrapper
  • what about argspec, etc? wrapt, Graham Dumpleton
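
The points about `*args`/`**kwargs` and copying `__name__` and `__doc__` can be sketched as below. The `log_calls` decorator and the `add` function are hypothetical names for illustration, not code from the talk; `functools.wraps` is the standard-library helper that does the copying.

```python
import functools


def log_calls(func):
    """Hypothetical decorator that records the arguments of each call."""
    @functools.wraps(func)  # copies __name__, __doc__, etc. onto the wrapper
    def inner(*args, **kwargs):
        # *args and **kwargs let the wrapper fit any function signature.
        inner.calls.append((args, kwargs))
        return func(*args, **kwargs)
    inner.calls = []
    return inner


@log_calls
def add(a, b=0):
    """Add two numbers."""
    return a + b


result = add(2, b=3)
print(add.__name__, add.calls)
```

Without the `functools.wraps` line, `add.__name__` would report `'inner'` and the docstring would be lost.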

Porting from Python 2 to 3

  • Testing story is critical
  • Don’t port once; instead support both 2 and 3 for some time
  • Straddling 2 and 3
  • use compatible subset of Python
  • conditional imports for stdlib changes
  • six can help but may not be necessary
  • Supporting less than Python 2.6 is hard due to syntax changes
  • 3.2 bare minimum for 3, 3.4 next LTS
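
A minimal sketch of the straddling approach, assuming a code base that has to run unchanged on both 2 and 3. The `PY2` and `text_type` names are illustrative; six offers similar helpers.

```python
import sys

# Conditional import: this module moved between Python 2 and 3.
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

PY2 = sys.version_info[0] == 2
# One name for "text" on both versions; `unicode` is only evaluated on Python 2.
text_type = unicode if PY2 else str  # noqa: F821

host = urlparse("http://example.com/path").netloc
print(host)
```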

Ansible

  • Missed due to Inder meeting, review on video

Death of JavaScript

  • Missed due to Inder meeting, review on video

Getting started with machine learning

  • Good intro

Realtime analytics using Celery

  • SciKit-learn
  • Pika AMQP library: http://pika.readthedocs.org/en/latest/

Distributed Computing Is Hard, Let’s Go Shopping

  • Fallacies of Distributed Computing, Peter Deutsch
  • LoggingAdapter
  • Testing
    • develop/test outside celery
    • test with a single worker/job
    • test with one worker, multiple concurrent jobs
    • test with multiple servers
  • http://brolewis.com
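
Assuming the "LoggingAdapter" note refers to the standard library's `logging.LoggerAdapter`, here is a sketch of how it can attach job context to every log line. The `JobAdapter` class, the logger name and the job id are hypothetical.

```python
import io
import logging


class JobAdapter(logging.LoggerAdapter):
    """Hypothetical adapter that prefixes each message with a job id."""

    def process(self, msg, kwargs):
        return "[job %s] %s" % (self.extra["job_id"], msg), kwargs


# Log to an in-memory stream so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("worker_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

log = JobAdapter(logger, {"job_id": 42})
log.info("started")
output = stream.getvalue()
print(output, end="")
```

The calling code logs normally; the context is added in one place, which helps when output from many workers is interleaved.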

Fan-in and Fan-out, Brett Slatkin

  • https://github.com/bslatkin/pycon2014
  • The Crucial Components of Concurrency
  • Why do we need Tulip?
  • Fan-out: when one thread of control spawns one or more new threads of control
  • Fan-in: when one thread of control gathers results from one or more threads of control
  • Building a web crawler the old way:
    1. fetch
    2. extract
    3. crawl (breadth first search)
  • The new way, using Tulip
  • It’s everywhere!
    • SQL: fan-out – join, fan-in – group by
    • MapReduce, fan-out – map, fan-in – reduce
    • Measurement: Histograms, Reservoir samplers, profilers, estimators
  • PEP3156 - asyncio
  • App Engine’s NDB library
  • C# async/await – ES7 generators
  • “Concurrency is not Parallelism”
  • @haxor, onebigfluke.com
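
The fan-out and fan-in definitions above can be sketched with asyncio. This is not the talk's crawler: it uses the `async`/`await` syntax and `asyncio.run` (Python 3.7+), which postdate the talk, and `fetch` is a stand-in that sleeps instead of touching the network.

```python
import asyncio


async def fetch(url):
    """Stand-in for a network fetch; sleeps and returns a fake size."""
    await asyncio.sleep(0.01)
    return url, len(url)


async def crawl(urls):
    # Fan-out: start one task per URL.
    tasks = [asyncio.ensure_future(fetch(url)) for url in urls]
    # Fan-in: gather the results back into one thread of control.
    results = await asyncio.gather(*tasks)
    return dict(results)


sizes = asyncio.run(crawl(["http://a.example", "http://bb.example"]))
print(sizes)
```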

Lightning Talks, Friday PM

  • Traversing the Montreal Metro with Python
    • https://github.com/leafstorm/montreal-metro
  • 12-year-old Python programmer
  • Scrape interactive: @yarkot
    • scrape.readthedocs.org
  • Pair programming at Hackbright Academy
    • 10 week full stack developer training for women
  • Teaching data science with Python
  • Structured logging
    • formatting, invisible to app code
    • context
    • structlog, wraps anything
    • https://github.com/hynek/structlog
  • Python .NET, John Gill
  • Certificate Based SSH, Bob VanZandt
    • https://github.com/cloudtools/ssh-ca
  • Physics of Bowling, Jack Diederich
  • DIY stuffed animals
    • blender, numpy and PIL
    • 3-D model
    • https://github.com/caretdashcaret/Patternfy
  • Positive Python: IRC channel
    • #positivepython

Lightning Talks, Saturday AM

  • peep
    • pip install peep
    • track hashes of downloaded packages
  • socialite – Hadoop replacement in Python?
    • http://mobisocial.stanford.edu/
  • Starting Simply / Gamify Health project
  • Speeding up EventBrite with one line:
    • re._MAXCACHE = 10000
  • Software Gardening
    • not much relationship to engineering
    • more like taking care of a living thing
    • hmm.
    • look it up

Python, the next generation – Jessica McKellar

  • http://jesstess.com
  • Ask CS to count as math or science credit in HS

Python & Science – Fernando Pérez

  • IPython
  • Amazon Machine image of data and analysis
  • Python for Signal Processing – on GitHub
  • nbdiff – diff and merge IPython notebooks

Track memory leaks in Python – Victor Stinner

  • https://github.com/haypo
  • import gc; gc.get_referents(data)
  • objgraph
  • heap fragmentation
  • pytracemalloc: http://pytracemalloc.readthedocs.org, in the standard library (as tracemalloc) only from 3.4, backports to 2.7 and 3.3
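
A small sketch of the snapshot-and-compare workflow, using the `tracemalloc` module as shipped in Python 3.4+. The "leak" here is simulated with a list of byte strings.

```python
import gc
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Simulated leak: a large list that is never released.
leak = [bytes(1000) for _ in range(1000)]

after = tracemalloc.take_snapshot()
# Compare the snapshots, grouped by source line, largest change first.
top_stats = after.compare_to(before, "lineno")
for stat in top_stats[:3]:
    print(stat)

# gc.get_referents lists the objects that `leak` refers to directly.
referents = gc.get_referents(leak)
tracemalloc.stop()
```

The top entry should point at the line that built `leak`, which is the kind of hint that helps narrow down a real leak.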

Designing Django’s Migrations – Andrew Godwin

  • Author of South, “Please Stop Using MySQL”, Django 1.7 migration
  • Original plan was to split parts into Django and then South 2
  • Really should be a core part of Django
  • Schema Editor and Migrations
  • SchemaEditor: abstracts schema operations, works in terms of django fields/models, per db workarounds
  • Migrations: migration file reader/writer, dependency resolver, autodetector, applied/unapplied tracking
  • New migration format: simpler, introspectable, etc
  • PostgreSQL is fine, MySQL some problems, Oracle some problems, SQLite AAAAAHHHHHH!
  • Lessons learnt:
    • explicit is better than implicit
    • abstracting DBs is hard
    • composability rocks: eg. history squashing
    • Feedback is vital
  • http://aeracode.org
  • Django 1.7 release May 15 to end of May
  • future support for indicating if migration is blocking/non-blocking

Designing Poetic APIs – Erik Rose

  • Sapir-Whorf Hypothesis
  • “The limits of my language are the limits of my world.” - Wittgenstein
  • Abstracting out symbols is the root of all human language.
  1. Don’t be an architecture astronaut
    • The best libraries are extracted, not invented
  2. Consistency “Think like a wise man, but communicate in the language of the people.”
    • get(key, default) like dict, not fetch(default, key)
    • fail: need to reference your own documentation
    • fail: are you inventing novel syntax? is it necessary?
  3. Brevity
    • fail: copying chunks of code, probably too long
    • fail: always setting an attribute to the same thing – default
  4. Composability
    • x84 BBS? https://github.com/jquast/x84
    • eg. separate print and formatting
    • fail: classes with lots of state
    • fail: deep inheritance hierarchies
    • fail: violations of the law of Demeter
    • fail: mocking and tests, too much mocking, too many dependencies
    • testable code is decoupled code
  5. Plain Data
    • a method should take what it needs, no more – eg. file object not filename
    • fail: users immediately transforming your output
    • fail: instantiating one object
  6. Grooviness
    • Avoid nonsense representations
    • Fail shallowly
    • Resource acquisition is initialization: don’t have invariants that aren’t invariant
    • have compelling examples
    • fail: lack of clear starting point
    • fail: long/complicated documentation
  7. Safety / Walls
    • put barriers proportional to potential damage
    • Exceptions > Return Values
    • fail: docs that say “remember to…” or “make sure that…”
    • fail: what’s the most dangerous thing this can do here, put barking dogs nearby
  • Two clusters: lingual and mathematical
  • https://github.com/erikrose
  • MOAR:
    • Making Software: What really works and why
    • RESTful Web APIs
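
Two of the points above, consistency with `dict.get(key, default)` and taking a file object rather than a filename, can be sketched as follows. Both `count_words` and `Settings` are hypothetical examples, not APIs from the talk.

```python
import io


def count_words(lines):
    """Hypothetical helper: accepts any iterable of lines, such as an open
    file object, rather than a filename it would have to open itself."""
    return sum(len(line.split()) for line in lines)


class Settings:
    """Hypothetical wrapper whose get() mirrors dict.get(key, default)."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, key, default=None):
        return self._values.get(key, default)


# Callers can pass a real file, or an in-memory one in tests, with no mocking.
total = count_words(io.StringIO("one two\nthree\n"))
settings = Settings({"debug": True})
print(total, settings.get("debug"), settings.get("missing", 0))
```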

Writing RESTful web services with Flask – Miguel Grinberg

  • Missed due to lunch going long

Data intensive biology in the cloud: instrumenting ALL the things – Titus Brown

Discovering Python – David Beazley

  • Using Python to analyze 1.5TB of source code for a lawsuit

Python Packaging

  • Partial, headed to Web testing

Advanced techniques for Web functional testing – Julien Phalip

  • Selenium
  • Can be used to test responsive sites
  • needle checking CSS styles: https://github.com/bfirsh/needle
  • SauceLabs
  • http://davehaeffner.com/selenium-guidebook/
  • http://julienphalip.com

Performance Testing and Profiling: A Virtuous Cycle – Dan Crosta in AB

  • http://late.am/
  • talked about stress testing and load testing

Lightning Talks Saturday PM

  • Pandas http://pandas.pydata.org @sarah_guido
  • Documenting history, how to write great commit messages
    • Greg Ward #gergdocca
    • why use VCS? memory & collaboration
    • why you made that change?
    • tell me why (and what if you like)
      1. tell me what you did and why
      2. brevity is the soul of wit
      3. pick a style and stick with it
      4. pick a grammatical mode and stick with it
      5. spelling counts (so does grammar and punctuation)
      6. teamwork counts TELL ME WHY
  • dh-virtualenv
    • https://github.com/spotify/dh-virtualenv
    • marriage of deb packages and virtualenv
  • Larry Hastings, Things You Didn’t Know
    • http://larryhastings.com/
    • shlex.split() – unquote shell
    • liskov violation: liskov substitution principle / rectangle/square problem
    • inconsistent submodules: import all leaves in __init__.py
  • Machine Learning in Minutes
    • Identification trees
    • Reasoning vs.
  • Bleeding edge packages: for users and developers alike
    • bep – bleeding edge packages
    • goog.gl/x7N8x3
    • https://github.com/jgors/
  • Home brewing with Python
  • Distributing your Python Game
  • mpld3: Matplotlib and D3
    • http://mpld3.github.io
    • plugins
  • Server Security 101
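
The `shlex.split()` tip from the Larry Hastings talk, sketched with a made-up command line:

```python
import shlex

command = 'cp "my file.txt" /tmp/backup'

# A plain str.split() breaks the quoted filename in two and keeps the quotes.
naive = command.split()

# shlex.split() follows shell quoting rules and strips the quotes.
parsed = shlex.split(command)
print(naive)
print(parsed)
```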

Lightning Talks Sunday AM

  • RaspberryPi
  • fail2ban
  • contentd
  • How Matplotlib made my robot not suck
  • git-lockup
    • https://github.com/warner/git-lockup
    • also, https://github.com/warner/python-versioneer
  • python and natural language learning

Van Lindberg

  • distinguished service award: Raymond Hettinger
  • community service award: R. David Murray
  • Benjamin Peterson
  • open PSF membership

Guido van Rossum

  • No 2.8
  • #positivepython
  • geekdom
  • exception expressions
  • Kivy, python on mobile devices: site
  • Python 3.4 is awesome, NumPy is close, Django and Flask aren’t quite there yet
  • The worst part of contributing to CPython? very hard to get started
  • Python 3 adoption
  • Scientific/math support in core python?
  • What kinds of things make you uncertain/insecure?

Talks I did not see, but looked good:

  • Would like to see: Upgrade your Web Development Toolchain
  • A Scenic Drive through the Django Request-Response Cycle
  • Django: The good parts
  • Pushing Python: Building a High Throughput, Low Latency System – Kevin Ballard
  • In Depth PDB – Nathan Yergler
  • It’s Dangerous to Go Alone: Battling the Invisible Monsters in Tech – Julie Pagano
  • What is coming in Python packaging – Noah Kantrowitz
  • Fast Python, Slow Python – Alex Gaynor
  • Which messaging layer should you use if you want to build a loosely coupled distributed Python app? – Narahari Allamraju