Do you like Python? Do you like games? Are you a fan of autonomously controlled tanks? Then PyBot Arena might be just the thing for you!
PyBot Arena is a game engine where participants implement algorithms - in Python of course - for controlling tanks which are then pitted against each other. I created the game for the Python NZ Christchurch meetup as a fun way to end the year, with creative and hilarious results.
Instructions for running the game and creating your own bots are in the README. The various bots created by the folks at the meetup are checked in to the repository for your inspiration and enjoyment.
Please let me know if you create your own bots or use the game at a meetup. I'd love to see what people come up with.
There's also plenty of ways the game engine could be improved. I've documented some of them as issues on GitHub. Contributions are very welcome.
Many thanks to the author of the Hunter2 game on GitHub. This was the starting point for PyBot Arena.
It's been a month of meetup talks for me. The third and final one was about Python's early history, starting right from when Guido first began working on it in December 1989 through to around version 2.3, when Python was really taking off. Inspirations for the language and various key features are discussed, and an obligatory photo of a very young Guido van Rossum is included too!
This talk was given at the October 2024 Christchurch Python meetup and the slides are available here.
Two out of three talks this month done!
I was honoured to be one of the speakers at the From virtual envs to my reality session of the Python NZ Online meetup this evening. The session had four speakers covering various approaches to managing Python versions, virtual environments and project dependencies.
My talk was about Poetry, positioning it within the wider ecosystem and discussing some of its strengths and weaknesses. The slides are now available here.
I gave a presentation today at the Christchurch Rust meetup that introduced the widely used Axum web framework. The talk covered what Axum is, what it isn't, and some common usage patterns. Brian Thorne followed with an excellent talk about logging, tracing and metrics in Rust. Thanks to everyone who attended for your attention, questions and discussion.
The slides are available here. Example code and source for the slides are on GitHub.
Some time ago, I was looking for an idea for a small Rust project to help me get some practical experience with WebAssembly (wasm). I wanted to do something fun and whimsical and when I came across the Rust noise crate I knew I was on to something.
After much learning, iteration and procrastination, I present to you Faux Maps, a generator of random, cutesy maps.
Through the magic of WebAssembly, all the work of generating each map is done inside your browser. The web server hosting Faux Maps is not involved once the page is loaded.
I can take little credit for the actual map generation algorithm - that's all handled by the noise crate. I only selected some suitable generation parameters and colours, and wrapped it up into a WebAssembly build. The project is simple enough to be a clean reference for how to put together a Rust based WebAssembly project. I started with the excellent rust-webpack-template repo and simplified all aspects as much as possible.
I'd love to add some dynamic features to the maps such as drifting clouds and flocks of birds. These could be done as separate layers which sit above the base map image. Maybe some rainy afternoon...
I've recently become quite enamoured with Logseq. It's a modern, open source notes app and outliner with some unique and powerful features. At home I use it to keep track of personal notes and all kinds of random data, and at work it's become my personal knowledge dump and task management system. Logseq is very flexible and capable out of the box but it also has a plugin mechanism that allows you to extend its core functionality.
At work, I find it's useful to know when I've completed tasks, especially when preparing for standups and other meetings, and this is something that Logseq doesn't do out of the box. Two plugins already existed which provide this kind of functionality but I wasn't happy with either of them. One stored the completion timestamps in an inconvenient format that is difficult to use with Logseq's query functionality, and the other introduced a UI popup every time a task was completed, which got in my way.
So, I created the DONE Time plugin. It's currently very simple with no UI or settings, and just adds a done-ms property to completed task blocks. As the name indicates, the timestamp is in milliseconds (Unix time), making it easy to do precise filtering and sorting. The done-ms property is removed if a task's state changes from DONE to some other state.
The timestamps aren't nice to look at so I hide them using the :block-hidden-properties option in config.edn. I plan to explore whether it's possible to apply custom rendering of specific properties so that the timestamps can be displayed in a more human friendly way.
I hope that this plugin is of use to some Logseqers out there. It can be found in the Logseq Marketplace inside the app.
I gave a talk tonight at the Christchurch Python meetup titled "Security For Python Developers". It touched on general security issues from a Python perspective such as typo-squatting, and very Python specific issues such as the dark side of the pickle.
The slides are available here.
Thanks to everyone who came along and for the interesting digressions during and after the talk.
I have an Intel NUC that I've been using for monitoring and backups on my home network. It lives in the corner of a living room and the noise of the fan has been surprisingly and annoyingly loud, even when the machine is idle.
On other machines, I've had success using fancontrol to control fan speeds using hardware temperature sensors on the CPU and motherboard so I tried that first. It turns out the NUC has no software controllable fans so that was a non-starter.
After some research I found that the BIOS in the NUC is actually quite sophisticated and offers a number of options to control cooling and fan behaviour.
Here's what I ended up tweaking:
After saving these settings and rebooting the NUC the fan noise was almost inaudible yet temperatures were well within the acceptable range. These changes should hopefully shave a little off the power bill too.
IMAPClient 3.0.0 is out!
The big news here is that support for Python 2 - and older Python 3 versions - has been dropped. IMAPClient now supports Python 3.7 and later. This allows for the removal of various compatibility shims, expands the set of Python language features that can be used and reduces the testing load.
Dropping support for older Python versions has also opened up the possibility for type hints to be added throughout the code base. This is an ongoing effort although much has already been done.
With the focus on modernizing the project, there aren't many bug fixes or new features in this release. Upcoming releases will include fixes for various bugs around IDLE amongst other things.
Many thanks to everyone who contributed to this release, especially John Villalovos for his efforts around modernizing the code base and project infrastructure.
Note that release notes are no longer maintained in the project documentation and are instead automatically generated as part of each GitHub release. This takes advantage of GitHub's automatic release notes generation feature and uses PR labels to categorise changes.
As always, IMAPClient can be installed using pip (pip install imapclient). Full documentation is available at Read the Docs. Any bugs or feature requests should be filed as GitHub Issues and any questions can be asked using GitHub Discussions.
I gave a presentation today for the Christchurch Python meetup that talks about why pytest is such a fantastic testing framework and how to approach common testing tasks and problems with it. Some potential areas of confusion around pytest's "magic" were raised by the audience which led to some interesting unprompted discussions.
The slides are available here.
I gave a talk this week for the Christchurch Python meetup that went into how iteration works in Python in some detail. Iterators, iterables, generator functions, list expressions and the various language protocols involved with iteration in Python were all covered. The while loop even gets a brief mention.
The slides are available here.
Mike Kittridge and I ran a talk last week for the Christchurch Python meetup that covered a large assortment of ways that Python programs can store data and exchange it with other programs. We covered the ins and outs of binary files, text files, buffering, text encoding, newline endings, mmap, pickle, JSON and friends, MessagePack, Protobuf, tabular data formats, multi-dimensional data formats and key-value stores. It was quite the tour!
The slides are available here:
The release notes for Go 1.17 mention an interesting change in the Go compiler: function arguments and return values will now be passed using registers instead of the stack. The proposal document for the feature mentions an expected 5-10% throughput improvement across a range of applications, which is significant, especially as it requires no effort on the developer's part aside from recompiling with a newer version of the Go compiler.
I was curious to see what this actually looks like and decided to take a deeper look. This will get more than a little nerdy so buckle up!
Note that although the catalyst for this blog article was a change in Go, much of this article should be of interest generally even if you don't use Go.
At this point it's helpful to remind ourselves of what CPU registers are. In a nutshell, they are small amounts of high speed temporary memory built into the processor. Each register has a name and stores one word of data - 64 bits on almost all modern computers.
Some registers are general purpose while others have specific functions. In this article you'll come across the AX, BX and CX general purpose registers, as well as the SP (stack pointer) register which is special purpose.
It's also useful to remind ourselves what the stack is in a computer program. It's a chunk of memory that sits at the top of a program's memory space. The stack is typically used to store local variables, function arguments, function return values and function return addresses. The stack grows downwards as items are added to it.
Eli Bendersky has an excellent article about how stacks work which contains this helpful diagram:
When items are added to the stack we say that they are pushed onto the stack. When items are removed from the stack we say that they are popped off the stack. There are x86 CPU instructions for pushing and popping data onto and off the stack.
The SP register mentioned earlier points to the item currently at the top of the stack.
Note that I'm taking some liberties here. The stack works like this on x86 computers (and many other CPU architectures) but not on all of them.
In compiled software, when some code wants to call a function, the arguments for that function need to somehow be passed to the function (and the return values need to be passed back somehow when the function completes). There are different agreed-upon ways to do this and each style of passing arguments and return values around is a "calling convention".
The part of the Go 1.17 release notes quoted above is really about a change in Go's calling conventions.
This is all hidden from you unless you're programming in assembler or are trying to make bits of code written in different programming languages work together. Even so, it's still interesting to see how the machinery works under the hood.
In order to compare the code the Go compiler generates in 1.16 vs 1.17 we need a simple test program. It doesn't have to do much, just call a function that takes a couple of arguments which then returns a value. Here's the trivial program I came up with:
package main

import "fmt"

func add(i, j int) int {
    return i + j
}

func main() {
    z := add(22, 33)
    fmt.Println(z)
}
In order to see the CPU instructions being generated by the Go compiler we need a disassembler. One tool that can do this is the venerable objdump which comes with the GNU binutils suite and may already be installed if you're running Linux. I'll be using objdump in this article.
Go also ships with its own disassembler, available via the go tool objdump command.
It's tempting to use this output for our exploration here but this intermediate assembly language isn't necessarily a direct representation of the machine code that will be generated for a given platform. For this reason I've chosen to stick with objdump.
Let's take a look at the output from Go 1.16 which we expect to be using stack based calling. First let's build the binary using Go 1.16 and make sure it works:
$ go1.16.10 build -o prog-116 ./main.go
$ ./prog-116
55
Great! Now let's disassemble it to see the generated instructions:
$ objdump -d prog-116 > prog-116.asm
The first thing I noticed is that there's quite a lot of code:
$ wc -l prog-116.asm
164670 prog-116.asm
That's a lot of instructions for such a small program but this is because every Go program includes the Go runtime which is a non-trivial amount of software for scheduling goroutines and providing all the conveniences we expect as Go developers. Fortunately for us, the instructions directly relating to the code in our test program are right at the bottom:
(For clarity, I'm omitting the offsets and raw bytes that objdump normally provides, as well as some of Go's setup code.)
0000000000497640 <main.main>:
...
movq $0x37,(%rsp)
call 40a3e0 <runtime.convT64>
mov 0x8(%rsp),%rax
xorps %xmm0,%xmm0
movups %xmm0,0x40(%rsp)
lea 0xaa7e(%rip),%rcx # 4a2100 <type.*+0xa100>
mov %rcx,0x40(%rsp)
mov %rax,0x48(%rsp)
mov 0xb345d(%rip),%rax # 54aaf0 <os.Stdout>
lea 0x4290e(%rip),%rcx # 4d9fa8 <go.itab.*os.File,io.Writer>
mov %rcx,(%rsp)
mov %rax,0x8(%rsp)
lea 0x40(%rsp),%rax
mov %rax,0x10(%rsp)
movq $0x1,0x18(%rsp)
movq $0x1,0x20(%rsp)
nop
call 491140 <fmt.Fprintln>
...
Weird! This doesn't look like our code at all. Where's the call to our add function? In fact, movq $0x37,(%rsp) (move the value 0x37 to the memory location pointed to by the stack pointer register) looks super suspicious. 22 + 33 = 55, which is 0x37 in hex. It looks like the Go compiler has optimised the code, working out the addition at compile time and eliminating most of our code in the process!
In order to study this further we need to tell the Go compiler not to inline the add function, which can be done by annotating it with a special comment. add now looks like this:
//go:noinline
func add(i, j int) int {
    return i + j
}
Compiling the code and running objdump again, the disassembly looks more as we might expect. Let's start with main() - I've broken up the disassembly into pieces and added commentary.
The main func begins with base pointer and stack pointer initialisation:
0000000000497660 <main.main>:
mov %fs:0xfffffffffffffff8,%rcx
cmp 0x10(%rcx),%rsp
jbe 497705 <main.main+0xa5>
sub $0x58,%rsp
mov %rbp,0x50(%rsp)
lea 0x50(%rsp),%rbp
followed by,
movq $0x16,(%rsp)
movq $0x21,0x8(%rsp)
Here we see the arguments to add being placed onto the stack in preparation for the function call. 0x16 (22) is moved to where the stack pointer is pointing. 0x21 (33) is copied to 8 bytes after where the stack pointer is pointing (so earlier in the stack). The offset of 8 is important because we're dealing with 64-bit (8 byte) integers: an 8 byte offset means the 33 is placed on the stack directly after the 22.
call 4976e0 <main.add>
mov 0x10(%rsp),%rax
mov %rax,0x30(%rsp)
Here's where the add function actually gets called. When the call instruction is executed by the CPU, the current value of the instruction pointer is pushed to the stack and execution jumps to the add function. Once add returns, execution continues here, where z (stack pointer + 0x30, as it turns out) is assigned the returned value (stack pointer + 0x10). The AX register is used as temporary storage when moving the return value.
There's more code that follows in main to handle the call to fmt.Println but that's outside the scope of this article.
One thing I found interesting looking at this code is that the classic push instructions aren't being used to add values onto the stack. Values are placed onto the stack using mov instead. It turns out that this is for performance reasons: a mov generally requires fewer CPU cycles than a push.
We should also have a look at add:
0000000000497640 <main.add>:
mov 0x10(%rsp),%rax
mov 0x8(%rsp),%rcx
add %rcx,%rax
mov %rax,0x18(%rsp)
ret
The second argument (at SP + 0x10) is copied to the AX register, and the first argument (at SP + 0x08) is copied to the CX register. But hang on, weren't the arguments at SP and SP + 0x08? They were, but when the call instruction was executed the instruction pointer was pushed to the stack, which means the stack pointer had to be decremented to make room for it - so the offsets to the arguments have to be adjusted to account for this.
The add instruction is easy enough to understand. Here CX and AX are added (with the result left in AX). The result is then copied to the return location (SP + 0x18).
The ret (return) instruction grabs the return address off the stack and starts execution just after the call in main.
Phew! That's a lot of code for a simple program. Although it's useful to understand what's going on, be thankful that we don't have to write assembly language very often these days!
Now let's take a look at the same program compiled with Go 1.17. The compilation and disassembly steps are similar to Go 1.16:
$ go build -o prog-117 ./main.go
$ objdump -d prog-117 > prog-117.asm
The main disassembly starts the same as under Go 1.16 but - as expected - differs in the call to add:
mov $0x16,%eax
mov $0x21,%ebx
xchg %ax,%ax
call 47e260 <main.add>
Instead of copying the function arguments onto the stack, they're copied into the AX and BX registers. This is the essence of register based calling.
The xchg %ax,%ax instruction is a bit more mysterious and I only have theories regarding what it's for. Email me if you know and I'll add the detail here.
Update: The xchg %ax,%ax instruction is almost certainly there to work around a bug in a number of Intel processors. The instruction ("exchange AX with AX") does nothing but introduces two bytes of padding before the call instruction that follows - this serves to work around the processor bug. There's a Go GitHub issue which has much more detail. Thank you to the many people who wrote about this.
As we've already seen, the call instruction moves execution to the add function.
Now let's take a look at add:
000000000047e260 <main.add>:
add %rbx,%rax
ret
Well that's simple! Unlike the Go 1.16 version, there's no need to move arguments from the stack into registers in order to add them, and there's no need to move the result back to the stack. The function arguments are expected to be in the AX and BX registers, and the return value is expected to come back via AX. The ret instruction moves execution back to where call was executed, using the return address that call left on the stack.
With so much less work being done when handling function arguments and return values, it's starting to become clearer why register based calling might be faster.
So how much faster is register based calling? I created a simple Go benchmark program to check:
package main

import "testing"

//go:noinline
func add(i, j int) int {
    return i + j
}

var result int

func BenchmarkIt(b *testing.B) {
    x := 22
    y := 33
    var z int
    for i := 0; i < b.N; i++ {
        z = add(x, y)
    }
    result = z
}
Note the use of a variable outside of the benchmark function to ensure that the compiler won't optimise the assignment to z away.
The benchmark can be run like this:
go test bench_test.go -bench=.
On my somewhat long in the tooth laptop, the best result I could get under Go 1.16 was:
BenchmarkIt-4 512385438 2.292 ns/op
With Go 1.17:
BenchmarkIt-4 613585278 1.915 ns/op
That's a noticeable improvement - a 16% reduction in execution time for our example. Not bad, especially as the improvement comes for free for all Go programs just by using a newer version of the compiler.
I hope you found it interesting to explore some lower level details of software that we don't think about much these days and that you learned something along the way.
Many thanks to Ben Hoyt and Brian Thorne for their detailed reviews and input into this article.
Update: This article ended up generating quite a bit of discussion elsewhere, in particular:
I gave an online (COVID lockdown) presentation to the Christchurch Python Meetup tonight that covered a host of new features that were added to Python in versions 3.8 and 3.9. While preparing for the presentation I was blown away by how quickly Python is evolving (and I didn't even get to the new pattern matching features coming in 3.10).
The slides are available here.
IMAPClient 2.2.0 is out!
The most critical change in this release is a fix to avoid an exception when creating an IMAPClient instance under Python 3.9. imaplib (used by IMAPClient internally) now supports connection timeouts and this conflicted with IMAPClient's own timeout handling.
Other highlights for this release:
There are many more changes in this release. See the release notes for more details. Thank you so much to the many contributors to the project.
As always, IMAPClient can be installed using pip (pip install imapclient). Full documentation is available at Read the Docs.
use-package is an Emacs package which allows packages to be loaded declaratively. It's been around for ages and I've seen it used in other people's configurations, but I've only recently paid some real attention to it. I wish I'd learned how to use it sooner - it's really improved my Emacs config.
Let's look at an example:
(use-package magit
:bind (("C-x g" . magit-status)
("C-x C-g" . magit-status)))
When this is run, the use of the Magit package is declared and two key-bindings for its main function are defined. What's really nice here is that autoloads are installed for those key-bindings - the package isn't loaded until I actually type either of those shortcuts for the first time. This means Emacs can start faster and resources aren't used for features I don't use often (that said, Magit is something I do use all the time!).
Note use-package's clear, compact syntax. There's less ceremony for common startup tasks such as setting up key-bindings and I love how use-package encourages all setup related to a given package to be neatly grouped together.
It's common to need to run some code before or after a package is loaded in order to set it up. With use-package this is done using :init (before loading) and :config (after loading). Here's the above example with some "helpful" messages printed before and after the magit package is loaded:
(use-package magit
:init
(message "Loading Magit!")
:config
(message "Loaded Magit!")
:bind (("C-x g" . magit-status)
("C-x C-g" . magit-status)))
Again, because loading of the package is deferred until one of the key-bindings is used, the messages won't appear until I actually hit one of those keys.
Key-bindings are just one way that a deferred package might be loaded by use-package. Other mechanisms include assigning a file extension to a mode defined in the package (via :mode) or adding a function from the package into a hook (via :hook). use-package provides a variety of syntactic sugar to make this painless and concise.
Of course there are some packages which you always want to be loaded immediately. use-package can handle this too through the :demand keyword. Here's an example:
(use-package evil
:demand t
:custom
(evil-esc-delay 0.001 "avoid ESC/meta mixups")
(evil-shift-width 4)
(evil-search-module 'evil-search)
:bind (:map evil-normal-state-map
("S" . replace-symbol-at-point))
:config
;; Enable evil-mode in all buffers.
(evil-mode 1))
Here we have the Evil package being loaded immediately (due to the :demand t), with some configuration set before it's loaded (some customisations need to be set before loading), a key-binding added, and evil-mode being enabled globally.
Note the use of the :custom keyword here. This is a clean way of setting customisations that you could also set with Emacs' customize functionality. It can be nice to keep customisations with the use-package declaration, although I'm not that consistent about this myself.
In the previous example, the key-binding is slightly different to earlier examples because the binding is being set inside a specific keymap instead of globally. use-package provides clean syntax for this. Here's an example with multiple key-bindings being set up across multiple keymaps:
(use-package evil-args
:bind (:map evil-inner-text-objects-map
("a" . evil-inner-arg)
:map evil-outer-text-objects-map
("a" . evil-outer-arg)
:map evil-normal-state-map
("L" . evil-forward-arg)
("K" . evil-jump-out-args)
:map evil-normal-state-map
("H" . evil-backward-arg)
("L" . evil-forward-arg)
:map evil-motion-state-map
("H" . evil-backward-arg)
("L" . evil-forward-arg)))
Not bad! Much better than:
(define-key evil-inner-text-objects-map (kbd "a") 'evil-inner-arg)
(define-key evil-outer-text-objects-map (kbd "a") 'evil-outer-arg)
(define-key evil-normal-state-map (kbd "L") 'evil-forward-arg)
; you get the idea ...
By default, use-package only loads packages that have already been installed somehow, but it can integrate with a package manager too.
If you're already using the built-in Emacs package manager (package.el) then simply adding :ensure t to a use-package block will cause use-package to download and install the package if it's not already there. Extending our first example slightly:
(use-package magit
:ensure t
:bind (("C-x g" . magit-status)
("C-x C-g" . magit-status)))
This avoids the need to separately call package.el's package-install function or use the list-packages interface to install a package.
use-package can also work with other package managers. The powerful straight.el package manager has tight integration with use-package (it's what I use now).
If you want to learn more about use-package, the official README is approachable and comprehensive. There's plenty more to it than what I've covered here, although you don't need to know much to start seeing its benefits.
Other articles that you might also find helpful:
Edit 2020-05-15: Use :custom instead of setq in :init. Thanks Canatella.
After a recent Christchurch Python meetup, I was asked to create a list of Python libraries and tools that I tend to gravitate towards when working on Python projects. The request was aimed at helping newer Pythonistas find a way through the massive Python ecosystem that exists today.
This is my attempt at such a list. It's highly subjective, being coloured by my personal journey. I've done a lot of Python in my career but that doesn't mean that I've necessarily picked the best tool for each task. Still, I hope it's useful as a starting point (I don't think there's any big surprises here).
Many programming language communities are waking up to the fact that having a tool which takes care of code formatting for you is a productivity booster and avoids pointless arguments within teams. Go can take much of the credit for the recent interest in code formatters with the gofmt tool that ships with the standard Go toolchain. Rust has rustfmt, most JavaScript projects seem to prefer prettier, and an automatic formatter now seems to be de rigueur for any new language.
A few formatters exist for Python but Black seems to be fast becoming the default choice, and rightly so. It makes sensible formatting decisions (the way it handles line length is particularly smart) and has few configuration options so everyone's code ends up looking the same across projects.
All Python projects should be using Black!
Python has a number of options for processing command line arguments but I prefer good old argparse which has been in the standard library since Python 3.2 (and 2.7). It has a logical API and is capable enough for the needs of most programs.
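To give a flavour of how little is required, here's a minimal sketch of an argparse-based command line (the tool and its options are invented for illustration):
import argparse

# Hypothetical command line for a small image resizing tool.
parser = argparse.ArgumentParser(description="Resize images")
parser.add_argument("paths", nargs="+", help="image files to process")
parser.add_argument("--width", type=int, default=800, help="target width in pixels")
parser.add_argument("-v", "--verbose", action="store_true", help="enable verbose output")

args = parser.parse_args()
if args.verbose:
    print(f"resizing {len(args.paths)} files to width {args.width}")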
The standard library has a perfectly fine xUnit style testing package in the form of unittest but pytest requires less ceremony and is just more fun. I really like the detailed failure output when tests fail and the fixtures mechanism, which provides a more powerful and clearer way of reusing common test functionality than the classic setup and teardown approach. It also encourages composition in tests over inheritance.
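As a rough sketch of what fixtures look like (the fixture here is deliberately trivial and hypothetical):
import pytest

@pytest.fixture
def store():
    # Each test that asks for "store" gets a fresh dict.
    data = {}
    yield data
    # Teardown code would go here (nothing is needed for a plain dict).

def test_insert(store):
    store["key"] = "value"
    assert store["key"] == "value"
The test simply names the fixture as an argument - pytest takes care of creating it and cleaning up afterwards, and fixtures can be composed by having them depend on each other.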
pytest's extension mechanisms are great too. We have a handy custom test report hook for the API server tests at The Cacophony Project which includes recent API server logs in the output when tests fail.
So much data ends up being available in CSV or similarly formatted files and I've done my fair share of extracting data out of them or producing CSV files for consumption by other software. The csv package in the standard library is a well designed and flexible workhorse that deserves more praise.
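A typical use looks something like this (the file name and column names are made up for the example):
import csv

# Read rows as dicts keyed by the header line.
with open("measurements.csv", newline="") as f:
    for row in csv.DictReader(f):
        print(row["timestamp"], float(row["value"]))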
The datetime package from the standard library is excellent and ends up getting used in almost every Python program I work on. It provides convenient ways to represent and manipulate timestamps, time intervals and time zones. I frequently pop open a Python shell just to do some quick ad hoc date calculations.
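For example, quick calculations like these:
from datetime import date, datetime, timedelta

# How long between two dates?
print(date(2019, 12, 25) - date(2019, 5, 3))   # 236 days, 0:00:00

# What was the time 90 minutes ago?
print(datetime.now() - timedelta(minutes=90))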
datetime intentionally doesn't try to get too involved with the vagaries of time zones. If you need to represent timestamps in specific timezones or convert between them, the pytz package is your friend.
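A small, illustrative example (the timezone names come from the standard Olson database):
from datetime import datetime
import pytz

nzt = pytz.timezone("Pacific/Auckland")
utc_now = datetime.now(pytz.utc)
# The same instant, expressed in New Zealand local time.
print(utc_now.astimezone(nzt))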
There are times where you need to do more complicated things with timestamps and that's where dateutil comes in. It supports generic date parsing, complex recurrence rules and relative delta calculations (e.g. "what is next Monday?"). It also has a complete timezone database built in, so you don't need pytz if you're using dateutil.
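For instance, answering "what is next Monday?" might look like this (a minimal sketch):
from datetime import date
from dateutil.parser import parse
from dateutil.relativedelta import relativedelta, MO

# Generic date parsing.
print(parse("3 May 2006"))

# The next Monday on or after today.
print(date.today() + relativedelta(weekday=MO(+1)))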
Shell scripts are great for what they are but there are also real benefits to using a more rigorous programming language for the tasks that shell scripts are typically used for, especially once a script gets beyond a certain size or complexity. One way forward is to use Python for its expressiveness and cleanliness, and the plumbum package to provide the shell-like ease of running and chaining commands together that Python lacks on its own.
Here's a somewhat contrived example showing plumbum's command chaining capabilities, combined with some Python to extract the first 5 lines of output:
from plumbum.cmd import find, grep, sort

output = (find['-name', '*.py'] | grep['-v', 'python2.7'] | sort)()
for line in output.splitlines()[:5]:
    print(line)
In case you're wondering, the name is Latin for lead, which is what pipes used to be made from (and also why we also have plumbers).
attrs gives you Python class creation with a lot less boilerplate. It turns up all over the place and with good reason - you end up with classes that require fewer lines to define and behave correctly in terms of Python's comparison operators.
Here's a quick example of some of the things that attrs gives you:
>>> import attr
>>> @attr.s
... class Point:
...     x = attr.ib(default=0)
...     y = attr.ib(default=0)
...
>>> p0 = Point()  # using default values
>>> p1 = Point(0, 0)  # specifying attribute values
>>> p0 == p1  # equality implemented by comparing attributes
True
>>> p2 = Point(3, 4)
>>> p0 == p2
False
>>> repr(p2)  # nice repr values
'Point(x=3, y=4)'
There's a lot more to attrs than this example covers, and most default behaviour is customisable.
It's worth noting that data classes in Python 3.7 and later offer some of the features of attrs, so you could use those if you want to stick to the standard library. attrs offers a richer feature set though.
If you're making HTTP 1.0/1.1 requests with Python then you should almost certainly be using requests. It can do everything you need and then some, and has a lovely API.
As far as HTTP 2.0 goes, it seems that requests 3 will have that covered, but it's a work in progress at time of writing.
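A quick taste of the API (the URL and parameters are placeholders):
import requests

resp = requests.get("https://api.example.com/items", params={"page": 1}, timeout=10)
resp.raise_for_status()   # turn HTTP error statuses into exceptions
for item in resp.json():
    print(item)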
Effective use of virtual environments is crucial for a happy Python development experience. After trying out a few approaches for managing virtualenvs, I've settled on pew as my preferred tool of choice. It feels clean and fits the way I work.
That's what is in my Python toolbox. What's in yours? I'd love to hear your thoughts in the comments.
I had the pleasure of giving a talk about Python virtual environments at this week's Christchurch Python meetup. It described the problem that virtualenvs solve, some gotchas and the tools people use to create and manage them. We also spent some time on some of the newer entrants in this space including pew, pipenv and poetry. The slides are available.
Giving presentations is a great way of solidifying your knowledge of a particular subject - you want to make sure you're describing things accurately so end up doing extra research and thinking more deeply. I'm sure I get as much out of preparing for a talk as the people who attend.
pew is rapidly becoming my preferred virtualenv tool (over virtualenvwrapper). It feels a little cleaner to use and has some nice touches.
I've just published a Jupyter Notebook I used to present an introduction to the excellent Python attrs package at the November 2018 Christchurch Python meetup.
You can find it on Github.
influx-spout 2.1 has just been released and it includes a bunch of exciting new features. Here's the highlights...
The biggest addition is a new "downsampler" component which is useful for creating rolled up versions of metrics streams for long term archive. Generating rolled up versions of measurements in real-time is more straightforward than trying to create them from already stored measurements later, especially at large volumes. This approach also eliminates the extra load on the short-term database caused by batch rollup operations.
The downsampler reads measurements once they have been processed by influx-spout's filter and averages them over a configured period. The averaged measurements are emitted at the end of each sampling period at clean time boundaries. For example if the sampling period is 10 minutes then measurements will be emitted at 0 minutes, 10 minutes, 20 minutes and so on past the hour.
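To illustrate what "clean time boundaries" means, here's a rough sketch of the bucketing calculation in Python (influx-spout itself is written in Go; this is purely illustrative):
def bucket_start(ts, period):
    # Truncate a Unix timestamp (in seconds) down to the start of its
    # sampling period, e.g. period=600 gives 10 minute buckets aligned
    # to :00, :10, :20 and so on past the hour.
    return ts - (ts % period)

print(bucket_start(1500000123, 600))   # -> 1500000000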
Downsampled lines are emitted for consumption by an influx-spout writer which will deliver them to an InfluxDB instance.
influx-spout's filter component now supports a new filter rule type which allows matching against one or more tags in measurement lines. Using this rule type is much faster than using regex rules to achieve the same kinds of matches - 80-1000% faster depending on the number of tags to match and line size. This approach is also much safer as matching is independent of tag order and matches tag keys and values precisely.
The filter component now orders tag keys in all measurement lines passing through it. This ensures the best performance when measurements are inserted into InfluxDB. Predictable tag ordering is also required by the downsampler.
Various optimisations have been made in the filter to avoid unnecessary processing and allocations around line unescaping.
The hashes of measurement names are now only computed once which speeds up matching when multiple "basic" rules are in use.
If you or your organisation is a power user of InfluxDB you should really take a look at influx-spout. Version 2.1 is available via the project's Releases page on Github.
Also, if you're into Go development and performance sensitive software, influx-spout's code is worth studying.
IMAPClient 2.1.0 has just been released! Here's the main highlights:
… the email package from the standard library to parse fetched emails.
As always, IMAPClient can be installed using pip (pip install imapclient). Full documentation is available at Read the Docs.
Enjoy!
As well as my main gig, I do some work with the excellent folks at Jump Trading. My main focus there so far has been finalising and open sourcing a project - implemented in Go - called influx-spout.
influx-spout will primarily be of interest to you if:
influx-spout sits in between an incoming fire-hose of InfluxDB measurements (potentially from thousands of hosts) and one or more InfluxDB instances (or other services which accept the InfluxDB Line Protocol such as Kapacitor). It accepts InfluxDB measurements over either UDP or HTTP, applies sanity checks, batching, filtering and routing, and then writes out the measurements to their final destinations.
influx-spout provides flexibility: measurements can be sharded across backends, some classes of measurements can be duplicated to multiple backends, and measurements which are no longer important can be dropped before they get near a disk.
influx-spout also allows for easy changes in the way measurements are handled without requiring changes to the systems producing the measurements. External systems are configured to send measurements to a single, static endpoint (i.e. influx-spout) with any changes to the way measurements are handled taken care of by changes to influx-spout's configuration.
As a Go developer, I find influx-spout interesting because of the high volumes of data that it needs to support. Here's a few things we've done to ensure that influx-spout can handle large data volumes:
If you're into Go development and performance sensitive software, the influx-spout code is worth studying.
influx-spout v2.0.0 has just been released and can be downloaded from the project's Releases page. There's lots more information about the project in the project launch post and in the README.
I recently had to write some NodeJS code which uses the AWS SDK to list all the objects in an S3 bucket which potentially contains many objects (currently over 80,000 in production). The S3 listObjects API will only return up to 1,000 keys at a time so you have to make multiple calls, setting the Marker field to page through all the keys.
It turns out there's a lot of sub-optimal examples out there for how to do this which often involve global state and complicated recursive callbacks. I'm also a fan of the clarity of JavaScript's newer async/await feature for handling asynchronous code so I was keen on a solution which uses that style.
Here's what I came up with:
async function allBucketKeys(s3, bucket) {
  const params = {
    Bucket: bucket,
  };

  var keys = [];
  for (;;) {
    var data = await s3.listObjects(params).promise();
    data.Contents.forEach((elem) => {
      keys = keys.concat(elem.Key);
    });
    if (!data.IsTruncated) {
      break;
    }
    params.Marker = data.NextMarker;
  }
  return keys;
}
It's called like this:
// Remember to catch exceptions somewhere...
const s3 = connectToS3Somehow();
var keys = await allBucketKeys(s3, "my_bucket");
This solution is clean, concise and hopefully straightforward.
An important aspect that supports this solution is that the AWS API can return a Promise for a call (via .promise()) which can then be used with await. Given the need to conditionally call listObjects multiple times, an arguably clearer code structure can be achieved using await instead of callbacks.
I recently gave a talk at the Christchurch Python meetup which discussed the strengths and weaknesses of shell scripts, and why you might want to consider using Python instead of shell scripts. I went into some areas where Python is arguably worse than shell scripts, and then we dived into the excellent Plumbum package which nicely addresses most of those weaknesses.
The slides for the talk are available here.
I'm very happy to announce that IMAPClient 2.0.0 is out. Many thanks to all the contributors who helped to shape what is one of the biggest IMAPClient releases to date.
A major focus of this release was removing the dependency on backports.ssl and pyOpenSSL for TLS support. These dependencies were introduced in IMAPClient 1.0 to provide consistent TLS support, even for Python versions which didn't have good support in the standard library. Unfortunately they caused installation headaches for many people and had various rough edges.
Now that Python has solid built-in TLS support in readily available Python versions (2.7.9+ and 3.4+), it was time for IMAPClient to go back to using the standard library ssl package. This should make most IMAPClient installation problems a thing of the past, and also significantly reduces the dependencies that get pulled in when IMAPClient is installed.
Other highlights in this release:
The debug and log_file attributes are gone.
… the imapclient package.
New welcome property to allow access to the IMAP server greeting.
See the release notes for more details.
IMAPClient 1.1.0 has just been released! Many thanks to the recent contributors and the project's other maintainers, Nicolas Le Manchet and Maxime Lorant. This release is full of great stuff because of them. Here's some highlights:
See the release notes for more details.
Much work has already gone into the 2.0 release and it won't be too far behind 1.1.0. The headline change there is reworking IMAPClient's handling of TLS.
This week I had the pleasure of giving a talk to Data Science masters students at the University of Canterbury (NZ). It was my attempt at giving some real-world advice from many years working as a technology geek. I think it went well and I hope the audience got something useful out of it.
The slides are now available.
I also managed to promote my current gig as well. Expect an article about what I've been up to soon.
The Go compiler normally stops after it finds 10 errors, aborting with a too many errors message. For example:
$ go build
# sandbox/manyerrs
./errs.go:4: undefined: w
./errs.go:5: undefined: w
./errs.go:6: undefined: w
./errs.go:7: undefined: w
./errs.go:8: undefined: w
./errs.go:9: undefined: w
./errs.go:10: undefined: w
./errs.go:11: undefined: w
./errs.go:12: undefined: w
./errs.go:13: undefined: w
./errs.go:13: too many errors
This is useful default behaviour - if there are lots of problems you usually don't care about seeing all of the issues. You just fix what you can see and try again.
Sometimes though you really want to see all the errors.
The trick is to use the -gcflags option to the go tool to pass -e (show all errors) to the compiler. Here's how you do it:
$ go build -gcflags="-e"
./errs.go:4: undefined: w
./errs.go:5: undefined: w
./errs.go:6: undefined: w
./errs.go:7: undefined: w
./errs.go:8: undefined: w
./errs.go:9: undefined: w
./errs.go:10: undefined: w
./errs.go:11: undefined: w
./errs.go:12: undefined: w
./errs.go:13: undefined: w
./errs.go:14: undefined: w
./errs.go:15: undefined: w
./errs.go:16: undefined: w
./errs.go:17: undefined: w
Given that this was surprisingly difficult to find, I thought I'd write it down here. I hope this was useful.
I gave a talk at last night's Christchurch Python meetup about Python's relatively new asynchronous programming features. To be honest, I didn't know all that much about the topic and signed myself up for the talk to force myself to learn :)
I used Jupyter Notebook for the presentation. Its ability to mix together text and interactive Python snippets works really well for this kind of talk. I've published the slides on Github as it has native support for rendering Jupyter notebooks (thanks Github!).
The code has been imported, tickets have migrated and the Bitbucket repository is now a redirect: the IMAPClient project is now using Git and Github.
The Github link is: https://github.com/mjs/imapclient
(http://imapclient.freshfoo.com redirects to the new location)
This change has been planned for a long time. I've noticed that many contributors struggle with Mercurial and Bitbucket, and I greatly prefer Git myself nowadays.
Please email the mailing list or me personally if you see any problems with migration or have any questions.
IMAPClient 1.0.2 is out! This release comes with a few small fixes and tweaks, as well as some documentation improvements.
Specifically:
I announced that the project would be moving to Git and Github some time ago and this is finally happening. This release will be the last release where the project is on Bitbucket.
Those paying attention may have noticed that this site has changed. Up until recently, freshfoo.com was generated using a combination of PyBlosxom and rest2web but it's now using the wonderful Nikola static site generator. The change was prompted by a desire to have more flexibility about freshfoo.com's hosting situation - moving a static site around is far easier than a dynamically generated one. A static site also provides a snappier experience for the end user and there really wasn't a need for the site to be dynamically generated anyway.
Nikola also provides some nice tooling for the site author to help with the writing process. The nikola auto command, which serves up the site locally, automatically rebuilding as you work on it and refreshing browser tabs which are viewing the site, is especially useful.
The pre-existing articles for the site were written using a combination of raw HTML (back from the Blogger days!) and reStructuredText, which are both supported by Nikola (as well as Markdown and more), so converting them for use by Nikola required only minor massaging. The URL scheme for posts and articles has changed a little but there are permanent redirects in place to map the old URLs to their new locations. I think the new URL scheme is somewhat cleaner anyway.
The Disqus comments for historic articles have been preserved despite the URL changes for articles. Disqus provides a handy URL Mapper tool to support these situations.
I'm really happy with the result but it's possible that I might have missed something. Please leave a comment below if you see something that isn't right.
The IMAPClient mailing list has been hosted at Librelist for a while. It worked OK initially but the site doesn't seem to be getting much love these days. People also tend to find the "send an email to subscribe" model confusing.
After looking at a number of options, I've decided to shift the mailing list to Groups.io. It's a well run mailing list service which allows interaction via both the web and email. There's lots of nice bells and whistles but the main benefit I see is that it seems to be well thought out and efficient.
To join the new list or to find out more, take a look at https://groups.io/g/imapclient.
The change to the new mailing list is effective immediately. Please use the new one from now on. I'm going to send out invites to people who were active on the old list. IMAPClient's documentation has already been updated to reflect the change.
To ensure they aren't lost, the historical messages from the old list on librelist.com will be imported into the new one.
Please email me or comment here if you have any feedback about the change.
IMAPClient 1.0.0 is finally here! This is a monster release, bursting with new features and fixes.
Here's the highlights:
Enhanced TLS support: The way that IMAPClient establishes TLS connections has been completely reworked. By default, IMAPClient will attempt certificate verification and certificate hostname checking, and will not use known-insecure TLS settings and protocols. In addition, TLS parameters are now highly configurable.
This change unfortunately breaks backwards compatibility and also means that IMAPClient has a bunch of new dependencies. Please see the earlier blog article about the TLS changes as well as the release notes for more information.
STARTTLS support: When the server supports it, IMAPClient can now establish an encrypted connection after initially starting with an unencrypted connection using the STARTTLS command. The starttls method takes an SSL context object for controlling the parameters of the TLS negotiation.
Many thanks to Chris Arndt for his extensive initial work on this.
Robust search criteria handling: IMAPClient's methods that accept search criteria have been changed to take criteria in a more straightforward and robust way. In addition, the way the charset argument interacts with search criteria has been improved. These changes make it easier to pass search criteria and have them handled correctly.
Unfortunately these changes also mean that small changes may be required to existing code that uses IMAPClient. Please see the earlier blog article about the search changes as well as the release notes for more information.
Socket timeout support: IMAPClient now accepts a timeout at creation time. The timeout applies while establishing the connection and for all operations on the socket connected to the IMAP server.
Semantic versioning: In order to better indicate version compatibility to IMAPClient's users, the project will now strictly adhere to the Semantic Versioning scheme.
Performance optimisation for parsing message id lists: A short circuit is now used when parsing a list of message ids which greatly speeds up parsing time.
Installation via wheels: In addition to .zip and .tar.gz files, IMAPClient releases are now also available as universal wheels.
There have also been many other smaller fixes and improvements. See the release notes and manual for more details.
IMAPClient can be installed from PyPI (pip install imapclient) or downloaded via the IMAPClient site.
This release couldn't have been possible with the amazing support of Nylas. If you're developing software that needs to deal with email, save yourself a whole lot of pain and check out their email platform. If you're after a modern, extensible, cross-platform email client check out N1.
Following on from last week's post about the upcoming IMAPClient 1.0 release, I'd like to draw attention to some significant, compatibility breaking changes related to the handling of search criteria.
IMAPClient's methods that accept search criteria (search, sort, thread, gmail_search) have been changed to take criteria in a more straightforward and robust way. In addition, the way the charset argument interacts with search criteria has been improved. These changes make it easier to pass search criteria and have them handled correctly but unfortunately also mean that small changes may be required to existing code that uses IMAPClient.
The preferred way to specify criteria now is as a list of strings, ints and dates (where relevant). The list should be flat with all the criteria parts run together. Where a criteria takes an argument, just provide it as the next element in the list.
Some valid examples:
c.search(['DELETED'])
c.search(['NOT', 'DELETED'])
c.search(['FLAGGED', 'SUBJECT', 'foo', 'BODY', 'hello world'])
c.search(['NOT', 'DELETED', 'SMALLER', 1000])
c.search(['SINCE', date(2006, 5, 3)])
IMAPClient will perform all required conversion, quoting and encoding. Callers do not need to and should not attempt to do this themselves. IMAPClient will automatically send criteria parts as IMAP literals when required (i.e. when the encoded part is 8-bit).
Some previously accepted ways of passing search criteria will not work as they did in previous versions of IMAPClient. Small changes will be required in these cases. Here are some examples of how to update code written against older versions of IMAPClient:
c.search(['NOT DELETED']) # Before
c.search(['NOT', 'DELETED']) # After
c.search(['TEXT "foo"']) # Before
c.search(['TEXT', 'foo']) # After (IMAPClient will add the quotes)
c.search(['DELETED', 'TEXT "foo"']) # Before
c.search(['DELETED', 'TEXT', 'foo']) # After
c.search(['SMALLER 1000']) # Before
c.search(['SMALLER', 1000]) # After
It is also possible to pass a single string as the search criteria. IMAPClient will not attempt quoting in this case, allowing the caller to specify search criteria at a lower level. Specifying criteria using a sequence of strings is preferable however. The following examples (equivalent to those further above) are valid:
c.search('DELETED')
c.search('NOT DELETED')
c.search('FLAGGED SUBJECT "foo" BODY "hello world"')
c.search('NOT DELETED SMALLER 1000')
c.search('SINCE 03-May-2006')
The way that the search charset argument is handled has also changed.
Any unicode criteria arguments will now be encoded by IMAPClient using the supplied charset. The charset must refer to an encoding that is capable of handling the criteria's characters or an error will occur. The charset must obviously also be one that the server supports! (UTF-8 is common)
Any criteria given as bytes will not be changed by IMAPClient, but the provided charset will still be passed to the IMAP server. This allows already encoded criteria to be passed through as-is. The encoding referred to by charset should match the actual encoding used for the criteria.
The following are valid examples:
c.search(['TEXT', u'\u263a'], 'utf-8') # IMAPClient will apply UTF-8 encoding
c.search([b'TEXT', b'\xe2\x98\xba'], 'utf-8') # Caller has already applied UTF-8 encoding
The documentation and tests for search, gmail_search, sort and thread have been updated to account for these changes and have also been generally improved.
For those willing to try out the changes now please install from IMAPClient's tip. Any feedback on the changes and/or documentation would be hugely appreciated.
IMAPClient 1.0 is really close to being done now and it's going to be one of the biggest releases in its history (thanks largely to the support of the good people at Nylas).
The headline feature of this release is the completely revamped TLS support. With 1.0, IMAPClient will perform certificate verification and certificate hostname checking, and will not use known-insecure TLS settings and protocols - by default. In order to work around Python's historically patchy TLS support, IMAPClient uses backports.ssl and pyOpenSSL to provide consistent TLS functionality across all supported Python versions (2.6, 2.7, 3.3 and 3.4).
All this goodness doesn't come for free however. There were some compromises and compatibility breaks required to make it work:
There's a new section in the manual which has more details and includes examples of how to tweak the SSL context for some common scenarios.
For those willing to try out the changes now please install from IMAPClient's tip. Any feedback would be hugely appreciated.
Note that due to the hard work of various folks, TLS support is much better in Python 3.4 and 2.7.9.
I'm chuffed to announce that IMAPClient 0.13 is out!
Here's what's new:
See the NEWS.rst file and manual for more details.
IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
I'm also excited to announce that Nylas (formerly Inbox) has now employed me to work on IMAPClient part time. There should be a significant uptick in the development of IMAPClient.
The next major version of IMAPClient will be 1.0.0, and will be primarily focussed on enhancing TLS/SSL support.
I'm very happy to announce that IMAPClient 0.12 is out!
This is a big release. Some highlights:
See the NEWS.rst file and manual for more details.
Many thanks go to Inbox for sponsoring the significant unicode changes in this release.
IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
The next major version of IMAPClient will be 1.0.0, and will be primarily focussed on enhancing TLS/SSL support.
I gave a presentation introducing IMAPClient at the monthly Christchurch Python meetup on Thursday. It included a brief introduction to the IMAP protocol, the motivation for creating the IMAPClient package, some examples of how it compares to using imaplib from the standard library, and some discussion of future plans. There also may have been some silly pictures. The talk seemed to be well received. Thanks to everyone who attended.
I'm not sure how useful they'll be without the vocal part but the slides are now available online.
Apart from my talk, a wide range of topics came up including installing and managing multiple versions of Python side-by-side, TLS (SSL) and Python and the suckiness (or not) of JavaScript. As always, lots of fun. There's always a great bunch of people who make it along.
I've been wanting to do this for a while: IMAPClient is now completely hosted on Bitbucket. The Trac instance is no more and all tickets have been migrated to the Bitbucket issue tracker. http://imapclient.freshfoo.com now redirects to the Bitbucket site.
The primary motivation for this change is that it makes it possible for anyone to easily interact with IMAPClient's tickets using the Bitbucket account they probably already have. Due to spam issues, I had to turn off public write access to the Trac bug tracker a long time ago meaning I was the only one creating and updating tickets. This change also means less maintenance overhead for me, so more time to spend on IMAPClient.
Please let me know if you see anything related to the migration that doesn't look right.
I gave a talk last night at the monthly Christchurch Python meetup titled "Go For Pythonistas". It was an introduction to the Go programming language from the perspective of a Python developer. The talk went well, with plenty of questions and comments throughout. Thanks to all who attended for your interest!
I'm not sure how useful they'll be on their own but the slides for the talk are available here.
I mentioned Juju in the talk a few times which piqued a few people's interest so I also ended up doing a short impromptu talk about that. Juju is the amazing cloud orchestration software that I hack on at Canonical these days (written in Go). It's worth a few blog articles in itself.
Now that they have officially launched I can happily announce that the good folks at Inbox are sponsoring the development of certain features and fixes for IMAPClient. Inbox have just released the initial version of their open source email sync engine which provides a clean REST API for dealing with email - hiding all the archaic intricacies of protocols such as IMAP and MIME. IMAPClient is used by the Inbox engine to interact with IMAP servers.
The sponsorship of IMAPClient by Inbox will help to increase the speed of IMAPClient development and all improvements will be open source, feeding directly in to trunk so that all IMAPClient users benefit. Thanks Inbox!
The first request from Inbox is to fix some unicode/bytes handling issues that crept in as part of the addition of Python 3 support. It's a non-trivial amount of work but things are going well. Watch this space...
This is a somewhat belated announcement that IMAPClient 0.11 is out (and has been for a little over a week now). Notable changes:
Big thanks go out to John Louis del Rosario, Naveen Nathan, Brandon Rhodes and Benjamin Morrise for their contributions towards this release.
See the NEWS.rst file and manual for more details.
IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
IMAPClient 0.10 has just been released. This is an important release because it's the first to support Python 3!
Here are the highlights:

- The HIGHESTMODSEQ item in SELECT responses is now parsed correctly.
- Fixed the --port command line bug in imapclient.interact when SSL connections are made.
- Unit tests can now be run using tox.
- python setup.py test now runs the unit tests.

A massive thank you to Mathieu Agopian for his huge contribution to getting the Python 3 support finished. His changes and ideas feature heavily in this release.
See the NEWS.rst file and manual for more details.
IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
As I've mentioned in previous blog articles, my wife and I have been working on driving an analog VU meter based on the sound going out the Raspberry Pi's audio outputs. This now works!
Here's a video demonstrating the result:
The music1 is playing from a Raspberry Pi, with software running on the Pi digitally sampling the peak output audio level and writing that out to an 8-bit digital-to-analog converter (DAC). The DAC output is then used to drive the analog meter. If you're interested in knowing how all this hangs together, keep reading.
The Raspberry Pi is very capable of generating digital signals using its GPIO pins but an analog meter requires an analog signal. To convert a digital value emitted from the Raspberry Pi to an analog voltage we need a DAC. It's possible, and rewarding, to build a DAC using discrete components, but it's more convenient, accurate and reliable if you use an off-the-shelf DAC packaged as an integrated circuit (IC).
We've chosen to use the Analog Devices AD557. It has the following features:
Here's how the Raspberry Pi, DAC and VU meter were connected together:
Various GPIO pins on the Raspberry Pi are used to provide a number between 0 and 255 to the data input pins (1 through 7) on the DAC. The power supply and ground for the DAC are provided by the 5V power and ground pins available on the Raspberry Pi GPIO header.
The slightly odd-seeming output configuration involving three Vout pins is actually just what is required for normal operation. These three pins allow for alternate configurations where a non-standard output voltage range is required. See the AD557 documentation for more details.
The CS (chip select) and CE (chip enable) pins on the AD557 warrant some explanation. The DAC will only change its output voltage when both of these pins are low (at ground). This means you can set the right values on the data pins before telling the AD557 to update its output, avoiding unintended changes in output voltage while a new value is being set on the inputs.
There are use cases that require both the CS and CE pins to be used which I won't go into here. Since we only need one of these pins here, CS is connected permanently to ground. CE is connected to a GPIO pin so we can control it from software running on the Raspberry Pi. The value on the CE pin needs to be dropped from high (1) to ground (0) for at least 225ns in order for the AD557 to update its output voltage to a new value.
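To make the "set the data pins, then pulse CE" sequence concrete, here's a minimal sketch of how a value could be written to the DAC from Python using the RPi.GPIO module. The pin assignments are purely illustrative (not necessarily the ones we wired up) and this isn't the code we actually run - see the GitLab link below for that.

import RPi.GPIO as GPIO

DATA_PINS = [2, 3, 4, 14, 15, 18, 23, 24]  # DAC data bits 0-7 (illustrative)
CE_PIN = 25                                # chip enable, active low (illustrative)

GPIO.setmode(GPIO.BCM)
GPIO.setup(DATA_PINS, GPIO.OUT)
GPIO.setup(CE_PIN, GPIO.OUT, initial=GPIO.HIGH)

def write_dac(value):
    # Present each bit of the 8-bit value on the corresponding data pin...
    for bit, pin in enumerate(DATA_PINS):
        GPIO.output(pin, bool(value & (1 << bit)))
    # ...then pulse CE low so the AD557 latches the new value. The required
    # pulse width is only a few hundred nanoseconds so no explicit delay is
    # needed from Python.
    GPIO.output(CE_PIN, GPIO.LOW)
    GPIO.output(CE_PIN, GPIO.HIGH)

write_dac(128)  # roughly half-scale output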
The 15K resistor is there to limit current. The Sifam audio level meter we're using is current driven, and needs very little current at that. It will show 0 (where the meter goes into the red) at around 300uA. At the maximum output voltage from the DAC (5V), a 15K resistor will ensure the meter gets around 333uA (5 / 15000), which puts the needle nicely just into the red zone.
Note that although this meter is marked as a VU meter, we are in no way indicating a true VU reading. The VU term has a strict definition which we're not trying to emulate in any real way here.
The code that drives this circuit is available on GitLab. The files are:
Have a look around. There really isn't that much code involved.
I hope you've found this post interesting and/or educational. My wife and I are about to solder up the circuit and get it and the Pi into some kind of box. I'll certainly post some pictures of the result once that's done.
Irving Berlin's Russian Lullaby, performed by John Coltrane and band.
I'm working on driving an analog VU meter from my Raspberry Pi using whatever audio is going out the Pi's sound outputs. The de facto Linux sound system, PulseAudio, allows any sound output (or "sink" in PulseAudio's nomenclature) to be monitored. In PulseAudio land, each sink has a corresponding "source" called the monitor source which can be read just like any other PulseAudio input such as a microphone. In fact, to help with volume meter style applications, PulseAudio even allows you to ask for peak level measurements, which means you can sample the monitor source at a low frequency, with low CPU utilisation, but still produce a useful volume display. When this feature is used, each sample read indicates the peak level since the last sample.
The main PulseAudio API is asynchronous and callback based, and the documentation is primarily just an API reference. This makes it a little difficult to figure out how to get everything to hang together. Using the code from various open source projects (primarily veromix-plasmoid and pavumeter), along with the API reference, I was able to develop a fairly minimal code example that will hopefully be useful to others trying to do something similar. Although this example is written in Python, it is using the PulseAudio C API directly (via ctypes) so it should hopefully still be relevant if your application is written in C or another language.
Here's the demo code. The latest version is also available on GitLab. Note that in order to run this example you need to install Vincent Breitmoser's ctypes PulseAudio wrapper available at https://github.com/Valodim/python-pulseaudio.
import sys
from Queue import Queue
from ctypes import POINTER, c_ubyte, c_void_p, c_ulong, cast

# From https://github.com/Valodim/python-pulseaudio
from pulseaudio.lib_pulseaudio import *

# edit to match your sink
SINK_NAME = 'alsa_output.pci-0000_00_1b.0.analog-stereo'

METER_RATE = 344
MAX_SAMPLE_VALUE = 127
DISPLAY_SCALE = 2
MAX_SPACES = MAX_SAMPLE_VALUE >> DISPLAY_SCALE

class PeakMonitor(object):

    def __init__(self, sink_name, rate):
        self.sink_name = sink_name
        self.rate = rate

        # Wrap callback methods in appropriate ctypefunc instances so
        # that the Pulseaudio C API can call them
        self._context_notify_cb = pa_context_notify_cb_t(self.context_notify_cb)
        self._sink_info_cb = pa_sink_info_cb_t(self.sink_info_cb)
        self._stream_read_cb = pa_stream_request_cb_t(self.stream_read_cb)

        # stream_read_cb() puts peak samples into this Queue instance
        self._samples = Queue()

        # Create the mainloop thread and set our context_notify_cb
        # method to be called when there's updates relating to the
        # connection to Pulseaudio
        _mainloop = pa_threaded_mainloop_new()
        _mainloop_api = pa_threaded_mainloop_get_api(_mainloop)
        context = pa_context_new(_mainloop_api, 'peak_demo')
        pa_context_set_state_callback(context, self._context_notify_cb, None)
        pa_context_connect(context, None, 0, None)
        pa_threaded_mainloop_start(_mainloop)

    def __iter__(self):
        while True:
            yield self._samples.get()

    def context_notify_cb(self, context, _):
        state = pa_context_get_state(context)

        if state == PA_CONTEXT_READY:
            print "Pulseaudio connection ready..."
            # Connected to Pulseaudio. Now request that sink_info_cb
            # be called with information about the available sinks.
            o = pa_context_get_sink_info_list(context, self._sink_info_cb, None)
            pa_operation_unref(o)

        elif state == PA_CONTEXT_FAILED:
            print "Connection failed"

        elif state == PA_CONTEXT_TERMINATED:
            print "Connection terminated"

    def sink_info_cb(self, context, sink_info_p, _, __):
        if not sink_info_p:
            return

        sink_info = sink_info_p.contents
        print '-' * 60
        print 'index:', sink_info.index
        print 'name:', sink_info.name
        print 'description:', sink_info.description

        if sink_info.name == self.sink_name:
            # Found the sink we want to monitor for peak levels.
            # Tell PA to call stream_read_cb with peak samples.
            print
            print 'setting up peak recording using', sink_info.monitor_source_name
            print
            samplespec = pa_sample_spec()
            samplespec.channels = 1
            samplespec.format = PA_SAMPLE_U8
            samplespec.rate = self.rate

            pa_stream = pa_stream_new(context, "peak detect demo", samplespec, None)
            pa_stream_set_read_callback(pa_stream,
                                        self._stream_read_cb,
                                        sink_info.index)
            pa_stream_connect_record(pa_stream,
                                     sink_info.monitor_source_name,
                                     None,
                                     PA_STREAM_PEAK_DETECT)

    def stream_read_cb(self, stream, length, index_incr):
        data = c_void_p()
        pa_stream_peek(stream, data, c_ulong(length))
        data = cast(data, POINTER(c_ubyte))
        for i in xrange(length):
            # When PA_SAMPLE_U8 is used, samples values range from 128
            # to 255 because the underlying audio data is signed but
            # it doesn't make sense to return signed peaks.
            self._samples.put(data[i] - 128)
        pa_stream_drop(stream)

def main():
    monitor = PeakMonitor(SINK_NAME, METER_RATE)
    for sample in monitor:
        sample = sample >> DISPLAY_SCALE
        bar = '>' * sample
        spaces = ' ' * (MAX_SPACES - sample)
        print ' %3d %s%s\r' % (sample, bar, spaces),
        sys.stdout.flush()

if __name__ == '__main__':
    main()
When running this demo, you'll need to modify SINK_NAME to match the name of the sink you want to monitor. If you're not sure of the sink, just run it - the program prints the details of all available sinks to stdout. If all goes to plan you should see a basic volume display in the console (when sound is actually playing!).
In this demo, the PeakMonitor class does all the interaction with the PulseAudio API. It needs the name of a sink to monitor and the sampling rate. Iterating over a PeakMonitor instance will give 8-bit peak level samples (actually they're from 128 to 255 - see comment in code and comment from Tanu on this article).
The main function implements a simple volume display. Some of the constants at the top of the code control how the volume level is displayed.
As mentioned earlier, the PulseAudio API is asynchronous, making heavy use of callbacks. This goes for API calls that return information about the sound configuration of the system as well as API calls that return or accept actual sound data. The reason for this is that the available sound devices may change at any point as a program is running (e.g. when USB audio devices are connected) and PulseAudio clients may need to be able to handle such changes.
Here's how the demo program interacts with PulseAudio:
- __init__ sets up the PulseAudio processing loop in a separate thread, establishes a connection to the PulseAudio daemon and asks PulseAudio to call the context_notify_cb method with updates about the status of the connection to PulseAudio.
- context_notify_cb will be called several times by the PulseAudio API as the connection is established. All things going to plan, the connection will reach the PA_CONTEXT_READY state. At this point, we request that the sink_info_cb method be called with information about each available sink.
- sink_info_cb is called once for each available sink. If a sink with a name matching the one passed to __init__ is seen, a stream is created on its monitor source and stream_read_cb is set to be called with 8-bit peak level samples.
- stream_read_cb is then called at the rate requested when the PeakMonitor class was instantiated. pa_stream_peek reads the available sample(s) from the stream and pa_stream_drop tells PulseAudio that we're done with the stream data (presumably so the buffer can be re-used or deallocated). The callback functions are all called on the mainloop thread created by the PulseAudio API so the _samples Queue instance is used to safely return samples back to the main thread (__iter__ reads the samples from this queue).

When running this demo on my fairly high-spec laptop it has no noticeable impact on CPU utilisation, but it's another story when it runs on the Raspberry Pi: around 25% of the CPU is required when the monitor sampling rate is 344 Hz. The issue is that we've got PulseAudio calling our Python function (stream_read_cb) at a significant frequency and Python just isn't that fast on the Raspberry Pi's 700MHz CPU. The pointer manipulation being done in stream_read_cb, which would be incredibly fast in C, is being done using a significant amount of Python bytecode and function calls (partially because ctypes is being used to do them).
I'm not too comfortable with constantly pegging the Raspberry Pi CPU at 25% as I'd eventually like to have an always-running daemon process driving the VU meter on the Raspberry Pi. It doesn't seem like a good idea to constantly have the CPU doing that much work. I plan on trying out Cython to convert the PulseAudio interaction parts to C, or rewriting the entire program in pure C since it's not really taking much advantage of Python's features. More on this to come.
I hope this article is useful to developers who are interested in using the PulseAudio API. Please note that I'm by no means a PulseAudio expert. Please let me know if there's anything I could be doing better and I'll update the article.
(This article is actually quite far behind where I'm at with the VU meter project. I already have a DAC connected to the GPIO ports, driving an analog VU meter using the sound going out the Raspberry Pi's audio outputs. I'm hoping to publish some more articles this week.)
Update (2013-02-14): I've updated the code and article to explain about why peak samples only range from 128 to 255. See Tanu's comment below too.
Update (2013-05-06): There's now a new article that describes how this code is used to actually drive a VU meter from the Raspberry Pi.
IMAPClient 0.9.2 was released yesterday. In this release:
See the NEWS file and manual for more details.
IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
Note that the official project source repository is now on Bitbucket. http://imapclient.freshfoo.com/ is still the official home page and is still used for project tracking. It is only the source repository that has moved.
IMAPClient 0.9.1 is out! In this release:
See the NEWS file and manual for more details.
As always, IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
A relatively unknown part of the Python standard library that I find myself using fairly regularly at work these days is the groupby function in the itertools module. In a nutshell, groupby takes an iterator and breaks it up into sub-iterators based on changes in the "key" of the main iterator. This is of course done without reading the entire source iterator into memory.
The "key" is almost always based on some part of the items returned by the iterator. It is defined by a "key function", much like the sorted builtin function. groupby probably works best when the data is grouped by the key but this isn't strictly necessary. It depends on the use case.
I've successfully used groupby for splitting up the results of large database queries or the contents of large data files. The resulting code ends up being clean and small.
Here's an example:
from itertools import groupby
from operator import itemgetter

things = [('2009-09-02', 11),
          ('2009-09-02', 3),
          ('2009-09-03', 10),
          ('2009-09-03', 4),
          ('2009-09-03', 22),
          ('2009-09-06', 33)]

for key, items in groupby(things, itemgetter(0)):
    print key
    for subitem in items:
        print subitem
    print '-' * 20
Here the dummy data in the "things" list is grouped by the first item of each element (that is, the key is the first element). For each key, the key is printed followed by the items returned by each sub-iterator.
The output looks like:
2009-09-02
('2009-09-02', 11)
('2009-09-02', 3)
--------------------
2009-09-03
('2009-09-03', 10)
('2009-09-03', 4)
('2009-09-03', 22)
--------------------
2009-09-06
('2009-09-06', 33)
--------------------
The "things" list is a contrived example. In a real world situation this could be a database cursor object or a CSV reader object. Any iterable object can be used.
Here's a closer look at what groupby is doing using the Python interactive shell:
>>> iterator = groupby(things, itemgetter(0))
>>> iterator
<itertools.groupby object at 0x95d3acc>
>>> iterator.next()
('2009-09-02', <itertools._grouper object at 0x95e0d0c>)
>>> iterator.next()
('2009-09-03', <itertools._grouper object at 0x95e0aec>)
You can see how a key and sub-iterator are returned for each pass through the groupby iterator.
groupby is a handy tool to have under your belt. Think of it whenever you need to split up a dataset by some criteria.
So I have this problem ... Well it's not really a problem - I can stop whenever I want, really I can. My problem is that I have a thing for tiling window managers (WMs)1. I love the efficient window management, keyboard focussed operation, extensive customisability and lightweight feel that most tiling WMs offer. For the X Window System 2 there's an awful lot to choose from, and I've been obsessed for some time now with finding a great tiling WM that works for me and then configuring it to perfection.
I've spent an embarrassing amount of time installing, learning about, configuring and tinkering with tiling WMs. Over the last 6 years I've tried out, with at least some seriousness, all of the following:
These tiling WMs all have their pros and cons but I've ended up settling3 on awesome (version 3) as it is stable (i.e. not someone's project that was never finished and hasn't been touched in years), highly customisable (almost everything can be changed using Lua), sane to configure with sensible defaults, handles multiple monitors well and lacks irritating bugs.
I use awesome both at home and at work, and until recently I'd been maintaining two separate configurations. After being frustrated by the process of repeatedly synchronising changes between the two configurations, I decided to merge the two and clean up a bunch of stuff at the same time.
The result can be seen on GitLab at: https://gitlab.com/menn0/awesome-config
My goals for the new configuration were:
Some highlights from my configuration:
Note that I'm by no means a Lua expert. I'd love to hear feedback on what I've done from more experienced Lua developers and awesome users.
If you're new to tiling window managers, I hope this post has piqued your interest (or at least hasn't put you off them!). If you're already using awesome, I hope I may have encouraged you to re-think your configuration. I know I'm obsessive about these things, but I think there's much to gain from optimising the way you interact with the software you use the most, and your window manager certainly meets that criterion.
Update: As requested, a screenshot of how my config looks (on my laptop).
The panel at the top has, from left to right: a button for the application launcher menu, the tag (workspace) list, a window list for the current tag, CPU and memory usage graphs, a battery widget, volume widget, system tray (containing the Dropbox and Network Manager icons), a clock and an awesome layout indicator applet. Most/all of the UI elements are implemented in Lua.
Three application windows are shown in the tag, organised using the "tile" layout. Chrome in the "master" position with Emacs and a shell off to the right.
If you're unsure what tiling window managers are about, or why you'd consider using one, there are a few useful pages around.
I use Linux almost exclusively.
By "settled" I mean, "have used exclusively for about a year now".
The same goes for your editor configuration, but that's another story.
My Raspberry Pi arrived a couple of weeks ago and I've been working on turning it into a mini-audio server to connect to the home stereo in the living room.
As part of the project I'd like to drive an analog VU meter from the sound signal.
This week my (enthusiastic!) wife and I played around with attaching some basic electronics to the GPIO ports so that we could get more comfortable with the Raspberry Pi. Our electronics knowledge is more than a little rusty but surprisingly everything we tried worked first go.
On the first night we attached an LED in line with a resistor directly to a GPIO port and were able to programmatically turn it on and off. Easily done.
Then we took that a little further by using a transistor to switch the LED from the Pi's 5V power supply. This is a better option because the circuit can be arranged so that a minimal current is pulled from the GPIO pin but a higher current can be put through the LED to give a nice bright light. The amount of current you can put through the GPIO pins without damaging the Pi is limited so this is a safer option (although not strictly needed for a single LED). There's an excellent page on elinux.org which explains this arrangement.
Here's the result (sorry about the dodgy video quality):
The Python code driving the LED looks like:
import itertools
import time

import RPi.GPIO as GPIO

# use the Broadcom (BCM) GPIO numbering scheme
GPIO.setmode(GPIO.BCM)

# set up GPIO output channel
GPIO.setup(17, GPIO.OUT)

pin_values = itertools.cycle([GPIO.HIGH, GPIO.LOW])
for pin_value in pin_values:
    GPIO.output(17, pin_value)
    time.sleep(1)
On the second night we duplicated this circuit 5 times to drive 5 LEDs:
We were quite chuffed that we managed to pack everything neatly into one end of the breadboard so that all the LEDs were in a line.
The plan now is to get PulseAudio working. It provides a nice way to intercept the sound going to the audio output. It should be possible to use that to drive these LEDs like the lights on a retro 80's hi-fi system.
And after comes driving an analog meter which will require digital-to-analog conversion either by using the PWM channels or an external DAC chip. More on that to come.
I'm pleased to announce version 0.9 of IMAPClient, the easy-to-use and Pythonic IMAP client library.
Highlights for this release:
The NEWS file and manual have more details on all of the above.
As always, IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
The main focus of the next release (0.10) will be Python 3 support as this is easily the most requested feature. Watch this space for more news on this.
The first release of my first Elisp project is out. The project is called "elemental" and the intro from the README goes:
elemental is a set of Emacs Lisp functions for intelligently jumping between and transposing list/tuple/dictionary/function-parameter elements. These functions are primarily useful when editing software source code.
It's probably easier to get an understanding what it does by demonstration so I've uploaded a quick screencast to Youtube.
The project is hosted on GitHub: https://github.com/mjs/elemental
Feedback and patches welcome!
Version 0.8.1 of IMAPClient has just been released. This version works around a subtle bug in distutils which was preventing installation on Windows from working. Thanks to Bob Yexley for the bug report.
This release also contains a few small documentation updates and packaging fixes. The NEWS file has more details.
As always, IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
Version 0.8 of IMAPClient is out! Although I didn't get everything into this release that I had hoped to, there's still plenty there. Thanks to Johannes Heckel and Andrew Scheller for their contributions to this release.
Highlights for 0.8:
- The interactive shell can now be launched directly from the command line (python -m imapclient.interact ...)

The NEWS file and main documentation have more details on all of the above.
As always, IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
A quilt incorporating the starting numbers of the Fibonacci sequence. Designed by me, expertly crafted by Susanna and adorably used by Amelia.
My domain is up for renewal soon and I recently received a very official looking letter from a company called Domain Renewal Group. On the surface it looks like a renewal notice from my registrar, but if you read more closely it mentions that the letter "isn't an invoice" and that if you return the form and pay them you'd be transferring the domain from your current registrar to them. Never mind that the price they're asking is almost 5 times higher than what my actual registrar wants for an annual renewal.
I bet this catches plenty of people out - that's what their business model depends on. Bottom feeding scum.
Plenty of others have blogged about these guys as well, but I thought I'd add my voice. As more people learn about these kinds of scams they become less effective.
The next version of IMAPClient was quietly released on the weekend. I've been pretty busy so I'm just getting to telling the world about it now.
This release is earlier than planned because IMAPClient is featured in the IMAP chapter of Brandon Rhodes' new edition of the book Foundations of Python Network Programming. Brandon had made several bug reports and feature requests while working on the book and some of the examples in the book relied on unreleased changes to IMAPClient. IMAPClient 0.7 works with the book's examples.
What this does mean is that IMAPClient 0.7 doesn't have Python 3 support. I have been making headway with this however and with a little luck and a little more time, 0.8 should be Python 3 compatible.
Highlights for 0.7:
The NEWS document has more details on all the above.
Proper documentation is also on its way. I've been slowly making headway with Sphinx based documentation.
As always, IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
This release fixes some parsing issues when dealing with FETCH items that include square brackets (eg. "BODY[HEADER.FIELDS (From Subject)]") and includes the example when the package is installed using PyPI.
Also, IMAPClient now uses the excellent Distribute instead of setuptools.
As always, IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
In the pipeline now is better (Sphinx based) documentation, cleaning up the tests, OAUTH support and Python 3 support.
I've just released IMAPClient 0.6.1.
The only functional change in the release is that it now automatically patches imaplib's IMAP4_SSL class to fix Python Issue 5949. This is a bug that's been fixed in later Python 2.6 versions and 2.7 but still exists in Python versions that are in common use. Without this fix you may experience hangs when using SSL.
The patch is only applied if the running Python version is known to be one of the affected versions. It is applied when IMAPClient is imported.
The only other change in this release is that I've now marked IMAPClient as "production ready" on PyPI and have updated the README to match. This was prompted by a request to clarify the current status of the project. Seeing that all the current functionality is solid and that I don't plan to change the existing APIs in backwards-incompatible ways, I've decided to indicate that the project is suitable for production use.
As always, IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site. Feedback, bug reports and patches are most welcome.
IMAPClient 0.6 is finally out! Highlights:
Be aware that there have been several API changes with this release. See the NEWS file for further details.
IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site.
As always, feedback, bug reports and patches are most welcome.
Special thanks goes to Mark Hammond. He has contributed a significant amount of code and effort to this release. Incidentally, Mark is using IMAPClient as part of the backend for the Raindrop Mozilla Labs project.
At my employer we are in the process of migrating from Python 2.4 to 2.6. When running some existing code under Python 2.6 we started getting DeprecationWarnings about "object.__new__() takes no parameters" and "object.__init__() takes no parameters".
A simple example that triggers the warning:
class MyClass(object):

    def __new__(cls, a, b):
        print 'MyClass.__new__', a, b
        return super(MyClass, cls).__new__(cls, a, b)

    def __init__(self, a, b):
        print 'MyClass.__init__', a, b
        super(MyClass, self).__init__(a, b)

obj = MyClass(6, 7)
This gives:
$ python2.4 simple-warning.py
MyClass.__new__ 6 7
MyClass.__init__ 6 7
$ python2.6 simple-warning.py
MyClass.__new__ 6 7
simple-warning.py:5: DeprecationWarning: object.__new__() takes no parameters
return super(MyClass, cls).__new__(cls, a, b)
MyClass.__init__ 6 7
simple-warning.py:9: DeprecationWarning: object.__init__() takes no parameters
super(MyClass, self).__init__(a, b)
It turns out that a change to Python for 2.6 (and 3) means that object.__new__ and object.__init__ no longer take arguments - a TypeError is raised when arguments are passed. To avoid breaking too much pre-existing code, there is a special case that will cause a DeprecationWarning instead of TypeError if both __init__ and __new__ are overridden. This is the case we were running into with our code at work.
The reason for this change seems to make enough sense: object doesn't do anything with arguments to __init__ and __new__ so it shouldn't accept them. Raising an error when arguments are passed highlights places where code might be doing the wrong thing.
Unfortunately this change also breaks Python's multiple inheritance in a fairly serious way when cooperative super calls are used. Looking at the ticket for this change, this issue was thought about but perhaps the implications were not fully understood. Given that using super with multiple inheritance is common and "correct" practice, it would seem that this change to Python is a step backwards.
Let's look at an even simpler example, one that triggers a TypeError under Python 2.6.
class MyClass(object):

    def __init__(self, a, b):
        print 'MyClass.__init__', a, b
        super(MyClass, self).__init__(a, b)

obj = MyClass(6, 7)
The output looks like:
$ python2.4 simple.py
MyClass.__init__ 6 7
$ python2.6 simple.py
MyClass.__init__ 6 7
Traceback (most recent call last):
File "simple.py", line 7, in <module>
obj = MyClass(6, 7)
File "simple.py", line 5, in __init__
super(MyClass, self).__init__(a, b)
TypeError: object.__init__() takes no parameters
The fix would seem to be to not pass the arguments to object.__init__:
class MyClass(object):

    def __init__(self, a, b):
        print 'MyClass.__init__', a, b
        super(MyClass, self).__init__()

obj = MyClass(6, 7)
On the surface, this deals with the issue:
$ python2.6 simple-fixed.py
MyClass.__init__ 6 7
But what about when multiple inheritance is brought into the picture?
class MyClass(object):

    def __init__(self, a, b):
        print 'MyClass.__init__', a, b
        super(MyClass, self).__init__()


class AnotherClass(object):

    def __init__(self, a, b):
        print 'AnotherClass.__init__', a, b
        super(AnotherClass, self).__init__()


class MultiClass(MyClass, AnotherClass):

    def __init__(self, a, b):
        print 'MultiClass.__init__', a, b
        super(MultiClass, self).__init__(a, b)

obj = MultiClass(6, 7)
Things don't go so well:
$ python2.6 problem.py
MultiClass.__init__ 6 7
MyClass.__init__ 6 7
Traceback (most recent call last):
File "problem.py", line 21, in <module>
obj = MultiClass(6, 7)
File "problem.py", line 19, in __init__
super(MultiClass, self).__init__(a, b)
File "problem.py", line 5, in __init__
super(MyClass, self).__init__()
TypeError: __init__() takes exactly 3 arguments (1 given)
Not passing the arguments to the super call in MyClass.__init__ means that AnotherClass.__init__ is called with the wrong number of arguments. This can be "fixed" by putting the arguments to the __init__ calls back in but then these classes can no longer be used by themselves.
Another "fix" could be to remove the super call in MyClass.__init__ -after all, object.__init__ doesn't do anything. This avoids the TypeError but also means that AnotherClass.__init__ isn't called at all! This isn't right either.
What is the correct way to deal with this situation in Python 2.6+? Wise developers avoid multiple inheritance where possible but it is sometimes the best solution to a problem. If Python claims to support multiple inheritance then it should do so in a way that works with the language feature that accompanies it (super). This change to the behaviour of object.__init__ and object.__new__ means that super can't really be used with class hierarchies that involve multiple inheritance.
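For what it's worth, the workaround that usually gets suggested (not something from the examples above, and it arguably sidesteps the problem rather than fixes it) is to make every cooperative __init__ consume its own arguments as keywords and forward the rest, so that nothing is left over by the time object.__init__ is reached:

class MyClass(object):
    def __init__(self, a=None, **kwargs):
        print 'MyClass.__init__', a
        super(MyClass, self).__init__(**kwargs)

class AnotherClass(object):
    def __init__(self, b=None, **kwargs):
        print 'AnotherClass.__init__', b
        super(AnotherClass, self).__init__(**kwargs)

class MultiClass(MyClass, AnotherClass):
    def __init__(self, a, b):
        print 'MultiClass.__init__', a, b
        super(MultiClass, self).__init__(a=a, b=b)

obj = MultiClass(6, 7)

This runs cleanly on 2.6 and every class's __init__ still gets called, but it forces a keyword-based calling convention on the whole hierarchy.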
Should this get fixed or am I missing something?
(Thanks to Christian Muirhead for helping with investigations into this issue)
I just scratched an itch by writing a small plugin for PyBlosxom that allows the rst (reStructured Text) and readmore plugins to work together1. It defines a reST "break" directive which gets transformed into the breakpoint string the readmore plugin looks out for. This allows for "Read more..." breaks to be inserted in for reST based articles.
For further information see the Code page here and at the top of the plugin itself.
Yes, the audience for this plugin is probably tiny!
IMAPClient 0.5.2 has just been released. This release fixes 2 bugs (#28 and #33). Much thanks to Fergal Daly and Mark Eichin for reporting these bugs.
Install from the tarball or zip or upgrade using easy_install or pip.
Here's something I just learned the hard way.
If you edit a crontab with "crontab -e", cron won't reload the updated crontab immediately. Changes will be read at 1 second past the next minute boundary. For example, if you change the crontab at 10:54:32, cron will reload it at 10:55:01. This means that if you're impatiently testing how something runs under cron and set it to run at the next minute, you won't see it run!
I spent a good half hour chasing my tail on this one. Set the test entry to run 2 minutes ahead instead.
I've had several requests over the last few weeks to open up the IMAPClient Trac instance so that anyone can submit tickets. Initially I changed access levels so that anonymous users could create tickets. This turned out to be fairly inflexible: it doesn't allow people to add attachments or modify tickets later. This approach also resulted in one strange ticket being created where all fields were filled with random characters - a bot looking for buffer overruns?
Since then, I've disabled anonymous ticket creation and have set up the fantastic AccountManagerPlugin which allows people to register accounts for themselves. Once someone has created an account and logged in they can create and modify tickets. I have a feeling I'm going to have to turn on the optional CAPTCHA support, but I'm willing to see how it goes for a while first.
I've just made a quick point release of IMAPClient. Mark Hammond is interested in using it for a project he's working on but the licenses (GPL and MPL) were incompatible. I was thinking about relaxing the license of IMAPClient anyway so this presented a good opportunity to make the switch.
Work on the 0.6 release is coming along. This version will fix a number of issues with parsing of FETCH responses - the FETCH parsing code is being completely rewritten. This is the first time that IMAPClient will bypass most of imaplib for some functionality. It's looking like at some point IMAPClient may not use imaplib at all.
IMAPClient can be installed from PyPI using pip install IMAPClient or easy_install IMAPClient. It can also be downloaded from the IMAPClient project page. As always, feedback and patches are most welcome.
I've had a few requests for the little hack I created to import comments from PyBlosxom into Disqus. A cleaned up version of disqus-import.py is now on the Code page. There's some docs at the top of the file.
I recently sorted out an issue with the IMAPClient Trac instance that's been bugging me for a while.
The problem was that whenever the web server logs were rotated logrotate would restart Lighttpd. The web server restart would in turn restart the Trac (FastCGI) processes. Unfortunately, the Trac processes would fail to start with the following error.
pkg_resources.ExtractionError: Can't extract file(s) to egg cache
The following error occurred while trying to extract file(s) to the Python egg
cache:
[Errno 13] Permission denied: '/root/.python-eggs'
The Python egg cache directory is currently set to:
/root/.python-eggs
Bang, no IMAPClient web site (the rest of the site was ok). To band-aid the problem when it happened (and I noticed!) I'd issue a sudo /etc/init.d/lighttpd restart and everything would be fine again.
After some investigation I found that running /etc/init.d/lighttpd restart as root always triggered the problem, whereas restarting using sudo always worked. My guess is that restarting when logged in as root was leaving $HOME at /root even after Lighttpd had dropped to its unprivileged user account. The unprivileged user isn't allowed to write to /root so Trac blows up. setuptools seems to use $HOME instead of looking up the actual home directory of the current user.
The fix for me was to set the PYTHON_EGG_CACHE environment variable for the FastCGI processes to somewhere they are allowed to write to. This is done with the bin-environment option if you're using Lighttpd like me.
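If you'd rather not (or can't) touch the web server configuration, another option would be to set the variable from Python at the top of the FastCGI script itself, before Trac or setuptools get imported. The directory below is just an example - it only needs to be writable by the unprivileged web server user.

import os
os.environ.setdefault('PYTHON_EGG_CACHE', '/var/cache/trac/egg-cache')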
I imagine similar problems can happen with any Python app deployed using FastCGI.
I recently moved my blog comments away from the PyBlosxom comments plugin to a hosted system. The main driver was the ability for people to subscribe to comments for an article using email or RSS. It's a pain for people to have to check back to the site to see if someone has replied to their comments. I was also keen on user-experience-enhancing features such as integration with external systems like OpenID, Twitter and Yahoo.
My criteria were:
There are a number of hosted comment systems out there. The most popular options seem to be Disqus, JS-Kit Echo and IntenseDebate.
IntenseDebate was eliminated first because it doesn't seem to provide an import mechanism for custom websites. Import only seems to be supported for well known blog platforms such as Wordpress. There's no comment API either. The approach seems to be to leave your old comment system in place and just have new comments go into IntenseDebate. Not good enough, I wanted to completely replace the existing comments system.
After some deliberation I decided on JS-Kit Echo for one tiny reason: it supports the <pre> tag. The closest Disqus supported was the <code> tag which doesn't preserve white-space (useless for Python code samples).
So I paid my US$12 (it's the only service that doesn't have a free option) and started looking at how to import my existing comments using their API ... and quickly found that it sucks. Comments can be submitted but you can't specify a timestamp so they are dated with the import date. Far from ideal. Then there's the API for retrieving comments: it returns the data as JavaScript code (no not JSON)! It's pretty clear that the API is what they use with the JavaScript for Echo itself and geared for that use only. They've just thrown it out there and documented it, warts and all.
Back to the drawing board.
The only showstopper for Disqus was the lack of <pre>. Everything else about it was great: it met all my requirements and the API was clean and comprehensive. If only there was a way to have properly formatted source code in the comments.
Light bulb moment: use a CSS hack to make <code> in comments behave like <pre>. The trick is to turn code into a block element and change how white-space is handled. The CSS snippet looks like:
.dsq-comment-message code {
    display: block;
    white-space: pre;
}
Works great.
With the only blocker gone, I wrote a Python script with the help of Ian Lewis' excellent disqus-python-client package to pull in the existing comments from the old system. Within an hour or so it was ready to go.
Hopefully this article saves someone else some time if they decide to use one of these systems. Getting things running chewed up a lot more time than I had expected.
I've just (finally) finished setting up a proper website for IMAPClient. The new home for the project is http://imapclient.freshfoo.com/.
It's a Trac instance with Mercurial support that monitors the main trunk repository. All items from the TODO file in the source have been converted to tickets in the bug tracker. I've even created a hokey little logo.
Let me know if anything looks broken.
Time to work on some long-standing bugs...
I just finished configuring a Trac instance for IMAPClient. To keep page response times tolerable I'm using FastCGI.
It turns out there's a little gotcha when you serve an app at the root (ie. "/") using FastCGI with Lighttpd. It passes the wrong SCRIPT_NAME and PATH_INFO to the app causing unexpected behaviour. The problem isn't there if the app is served from a prefix (eg. "/trac").
The problem is apparently fixed somewhere in Lighttpd 1.5 but I'm on 1.4.13 as shipped with Debian.
With Trac there is a workaround. If you modify Trac's FastCGI runner (fcgi_frontend.py) to correct the SCRIPT_NAME and PATH_INFO environment variables before the rest of the app sees them, the problem is solved. Here's what my fcgi_frontend.py now looks like (header comments removed):
import pkg_resources
from trac import __version__ as VERSION
from trac.web.main import dispatch_request
from os import environ

import _fcgi

def application(environ, start_request):
    environ['PATH_INFO'] = environ['SCRIPT_NAME'] + environ['PATH_INFO']
    environ['SCRIPT_NAME'] = ''
    return dispatch_request(environ, start_request)

def run():
    _fcgi.WSGIServer(application).run()

if __name__ == '__main__':
    pkg_resources.require('Trac==%s' % VERSION)
    run()
Figuring this out cost me lots of time. Hopefully this information helps someone else out.
Relevant links:
Our relationship has only just started but it's already time for me to move on. Although I think you're great, Mercurial has become my new best friend. Things with her are just a little easier. We just clicked.
I get along so well with her friends; I never did quite fit in with yours. They always seemed somewhat immature.
It's also hard to ignore recent developments. It seems like Mercurial is going to become quite important in the future.
No hard feelings I hope. I truly wish you all the best for the future.
Menno
One of my pet peeves, especially with Python code, is trailing whitespace. It serves no purpose, introduces noise in diffs and wastes valuable bytes dammit (yes I'm being pedantic).
To make trailing whitespace visible in Emacs you can use the show-trailing-whitespace variable. Emacs of course has a command to remove trailing whitespace: delete-trailing-whitespace.
Better yet, to get rid of trailing whitespace automatically on save you can add a function to the write-contents hook. The following snippet will cause trailing whitespace to be removed on save, but just for Python files.
;; Automatically remove trailing whitespace when saving Python file
(add-hook 'python-mode-hook
          (lambda () (add-hook 'write-contents-hooks 'delete-trailing-whitespace t)))
The one problem with doing this is that when changing an existing codebase your commits could end up with many whitespace deltas, making it difficult to see the meat of your changes. Use with care.
I've just published a simple little PyBlosxom plugin called draftsdir. It solves a personal itch for me: I'd like to review draft blog entries on the site before they go live. The plugin lets you define a directory for draft entries which aren't shown unless you add a query arg to the URL (the default arg name is "drafts" but there's an option to change it). When you're happy with an entry you move it from the drafts directory to where your other published entries are. Simple.
For more details see the Code section of this site, the bzr repo or download it directly. There are install and configuration instructions at the top of the file.
I've been a long time Vim user and fan. Once you're proficient with the Vi way of editing it's difficult to go back to anything else. Other editors just feel inefficient and clunky.
That said, I've been jealous of certain Emacs features that I've seen while looking over the shoulders of my colleagues. Multiple frames (GUI windows), the clean extension system (via elisp), the tight process integration (shell, SQL, debuggers etc) and all sorts of unexpected bells and whistles; these goodies slowly ate away at my curiosity.
So 2 weeks ago I caved and decided to give Emacs another go. I hadn't really used it since university. Starting with my colleague's configuration and making some minor tweaks, I began using Emacs for serious work.
A few days in and I was enjoying exploring my new toy but it didn't feel quite right. Although I had a reasonable grasp of Emacs' editing keys and commands, most tasks took way too long, requiring convoluted hand gymnastics. My left pinky was permanently sore from constantly reaching for the Ctrl and Alt keys. I was missing those surgical, efficient Vi commands.
At the suggestion of one Emacs-using colleague I gave Viper (viper-mode) a try. It's an emulation that attempts to provide a fairly accurate Vi experience while still allowing full access to the rest of Emacs. To be honest I was expecting it to be a half-assed kludge. I was wrong. Viper is a mature Emacs extension and it does a great job of mitigating conflicts between the Emacs and Vi ways of doing things.
Viper mode proved to be the tipping point; because of it I'm sticking with Emacs. As far as I'm concerned it's the best of both worlds.
For anyone who's interested, my Emacs config is available in the repository browser here or via bzr. This is my personal configuration branch so it will update as I make changes to the setup. Note that I'm using the latest development (but rock solid) Emacs so there might be references to features in the config which only exist in this version.
Some items of note in my config:
vimpulse: Viper mode only emulates classic Vi. vimpulse provides a bunch of extra features which a Vim user will probably miss such as various "g" commands and visual select mode.
ido-mode: This is a standard feature of Emacs which isn't bound to keys by default. It gives amazing power by replacing the standard find-file and switch-buffer keystrokes with beefed up alternatives. The key features are quick directory switching and fuzzy, recursive name matching (but that's not all).
ibuffer: I've replaced the standard buffer list binding (C-x C-b) with ibuffer. This gives a more powerful and easier to use version of the standard buffer list and allows for crazy batch manipulation of buffers.
yasnippet: Mode specific template expansion. Powerful and super useful for cranking out commonly used sections of text (programming is full of them).
flymake - pyflakes integration: Flymake runs arbitrary programs over buffers on-the-fly. For Python files flymake has been configured to run pyflakes and highlight errors in code as I type. I might change this to use pylint at some stage because pylint finds a wider range of problems.
Some useful Emacs config hacking links:
I've just made my personal bzr repositories publicly available so that anyone can easily get to them (including me!) and so I can refer to things from blog articles. The repos are available for branching using bzr under http://freshfoo.com/repos/ and in human browseable form. See also the links in the left sidebar and in the code section of the site.
I'm using Loggerhead to provide the web viewable form (proxied via the main lighttpd server). It was very easy to set up (using serve-branches). I just wrote a simple init.d script to ensure it stays running.
It's been a long time between releases but I've just released IMAPClient 0.5.
From the NEWS file:
Folder names are now encoded and decoded transparently if required (using modified UTF-7). This means that any methods that return folder names may return unicode objects as well as normal strings [API CHANGE]. Additionally, any method that takes a folder name now accepts unicode object too. Use the folder_encode attribute to control whether encode/decoding is performed.
Unquoted folder names in server responses are now handled correctly. Thanks to Neil Martinsen-Burrell for reporting this bug.
Fixed a bug with handling of unusual characters in folder names.
Timezones are now handled correctly for datetimes passed as input and for server responses. This fixes a number of bugs with timezones. Returned datetimes are always in the client's local timezone [API CHANGE].
Many more unit tests, some using Michael Foord's excellent mock.py. (http://www.voidspace.org.uk/python/mock/)
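To make the folder name handling from the first item a bit more concrete, here's a rough usage sketch. The server details are placeholders, and list_folders/select_folder are IMAPClient's standard folder methods:

from imapclient import IMAPClient

client = IMAPClient('imap.example.com')
client.login('user', 'password')

# Folder names in responses are decoded from modified UTF-7 automatically,
# so non-ASCII names come back as unicode objects.
for flags, delimiter, name in client.list_folders():
    print name

# Unicode folder names can be passed in directly; they're encoded behind the
# scenes (controlled by the folder_encode attribute).
client.select_folder(u'Entw\xfcrfe')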
IMAPClient can be installed from PyPI using easy_install IMAPClient or downloaded from my Code page. As always, feedback and patches are most welcome.
I'm at PyCon 2009 in Chicago. So much awesome stuff has already been discussed and we're only a few hours in. Much of it very relevant for my employer (which is nice since they're paying).
Some random thoughts so far:
The conference seems really well run and the space at the hotel is great. The only problem I (and many others) have noticed so far is that the wifi can be a tad unreliable. Not surprising given the huge number of wireless devices around.
Well that was a bit scary. I've just updated the firmware on my iPhone to 2.2 (the horrors!). Actually, this normally shouldn't be too much of a trial. The difference here is that I did it from Windows running inside VMWare on a Linux host.
I initially tried the naive approach of applying the update normally using iTunes (but running inside VMWare). Don't do this! The host (Linux) USB subsystem gets in the way, leaving you with a somewhat useless iPhone in "recovery mode". It seems that the iPhone firmware upgrade procedure does something tricky with USB that doesn't play nicely with VMWare running on Linux.
To workaround the issue, some USB modules have to not be loaded on the Linux host during the upgrade. Extending the tips on Stephen Laniel's blog I created a blacklist file for modprobe that prevents certain USB modules from loading. Just unloading the modules manually isn't enough. The iPhone reconnects to the USB bus at least once during the upgrade process causing udev to reload the modules you've manually unloaded.
So before attempting the upgrade, create a file named something like /etc/modprobe.d/blacklist-usb with the following contents:
blacklist snd_usb_audio
blacklist usbhid
blacklist ehci_hcd
I'm not sure if ehci_hcd needs to be disabled, but I did this in my case.
Reload udev so that it knows about the new modprobe config file:
sudo /etc/init.d/udev reload
Now make sure these modules aren't already loaded:
sudo /sbin/modprobe -r snd_usb_audio
sudo /sbin/modprobe -r usbhid
sudo /sbin/modprobe -r ehci_hcd
Now go back to VMWare, cross your fingers and attempt the firmware upgrade using iTunes. Things should progress along slowly.
You might find that at one point during the upgrade that iTunes sits there with a dialog saying "Preparing iPhone for upgrade". If this goes on for several minutes it could be that the iPhone isn't connected to VMWare anymore. iTunes is just waiting for the device to appear. If this happens, reattach the iPhone to VMWare using the "Removable Devices" option on VMWare's VM menu. It's a good idea to occasionally check that the iPhone is connected to VMWare during the upgrade.
Once the upgrade is complete you can remove the modprobe config file and reload udev:
sudo rm /etc/modprobe.d/blacklist-usb
sudo /etc/init.d/udev reload
For the record, this was done using iTunes 8 running on Vista running inside VMware Workstation 6.5.0 build-118166 on Ubuntu Intrepid (8.10).
So my new employer has given me an iPhone and despite my dislike of vendor lock-in I really like it.
One thing I've now managed to achieve is a single, central set of contact data. Like most geeks I had people's email addresses, phone numbers and postal addresses scattered across various devices and email profiles. In my case this included a Nokia 6130, my laptop's Thunderbird profile, various work Thunderbird profiles, my Gmail account and my trusty old Palm Tungsten E2. None of these stores of contact data were being synced. Contacts were added haphazardly as required.
No longer. The mess has been solved through the use of a variety of tools.
Getting an iPhone was the tipping point. Suddenly it was possible to sync contact (and calendar) data over the air, without needing to connect to a PC. This is very attractive to me. I like being able to sync even when away from the computer where iTunes is installed. I'm using the excellent (and free) NeuvaSync service for this. It hijacks the iPhone's Microsoft Exchange support to push changes to contact data to and from Google Contacts.
I've chosen Google Contacts as the central storage point for this data because it's reliable, I already have a populated gmail account and (mainly) because of the large number of tools that can interface with it.
Syncing to the various Thunderbird instances I use is done using the Zindus extension. This syncs Thunderbird's address book with Google Contacts. It works fairly well as long as you remember to stick to Google's rules about contact entries (eg. email addresses must be unique across all contacts). It will warn you if there's a problem.
So, that covers syncing contact data to everywhere I need it: iPhone, Gmail and Thunderbird.
The final challenge was to pull together all the contact entries I had around the place and merge them into Google Contacts. This was a long and messy process that was helped by some ad hoc Python hacking. To get the data out of my Palm I used the pilot-address tool that ships with the pilot-link package. The excellent Gnokii tool pulled the contacts out of my Nokia (over Bluetooth, no less!). I wrote some Python scripts to help merge the highly imperfect sets of data, trying to make intelligent guesses where possible. Finally, OpenOffice Calc was used to manually fix the CSV before importing into Google Contacts.
It took a long time and many editing iterations but I now have a clean, centralised set of contacts. Say what you like about me needing to get out more, but I think this is awesome (and it was a worthwhile learning process).
I've just posted an article on my experiences with Linux on my shiny new Sony Vaio VGN-BZ11XN notebook. Hopefully it's useful to other Linux users who own or are considering purchasing this laptop.
Questions, updates and feedback most definitely welcome.
I've now converted two popular articles [1] on freshfoo.com so that they are generated via Michael Foord's excellent rest2web. Previously one of these articles was in plain HTML and the other was on the site wiki. I've got a few other articles in the works so I decided it was time to migrate the articles to a clean, consistent format. reStructuredText fits the bill perfectly.
I recently also started using the rst entryparser for PyBlosxom so now most of the site is generated using reStructuredText.
[1] according to the HTTP server logs
Glenn and I discovered markup.py today. It's a lightweight, simple Python module for generating HTML and XML. Tags are generated using Python objects, with calls to methods on those objects dynamically creating child tags (using __getattr__ trickery). A useful feature is that repeated tags of the same type can be blatted out by passing sequences of values.
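To give a feel for the style, here's a toy version of the __getattr__ trick. To be clear, this is not markup.py's actual API, just a minimal sketch of the general idea:

class Page:
    def __init__(self):
        self.parts = []

    def __getattr__(self, tag):
        # Any unknown attribute becomes a method that emits that tag.
        def emit(content):
            # A sequence produces repeated tags of the same type.
            items = content if isinstance(content, (list, tuple)) else [content]
            for item in items:
                self.parts.append("<%s>%s</%s>" % (tag, item, tag))
        return emit

    def __str__(self):
        return "\n".join(self.parts)

page = Page()
page.h1("Hello")
page.li(["one", "two", "three"])  # repeated <li> tags from a sequence
print(page)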
The module provides a natural way to create correct HTML/XML from Python and suited our needs perfectly. My only complaint is that the output isn't indented so it can be harder to read than it needs to be. It would probably be straightforward to add such an option. Also, it would be nice if the output could be streamed to a file-like object. Currently it builds the output in memory which could be a potential problem for large pages.
markup.py is worth adding to your toolbox if you ever have to generate bits of XML from Python. For simple jobs, it beats using a complex XML library and is less error prone than plain old string interpolation.
PyCon UK was fantastic. Many thanks to the organisers for such a great event.
As promised at the end of the talk, I've just put the slides and code online for my Python on Openmoko presentation. I'm hoping this stuff is useful to anyone looking to develop Python apps for the Openmoko phone.
PyCon UK 2008 is approaching fast. If you're a Python programmer in the UK (or are just Py-curious) then you really should be going. The talk schedule looks hugely exciting and the in-the-hallways and at-the-pub action will undoubtedly be fun and engaging. I guarantee you'll learn something about Python and come away feeling inspired.
Disclaimer: I'm presenting :) I'll be doing a talk on Python on the Openmoko Freerunner as well as helping out Michael and Christian with a 4 hour tutorial: Developing with IronPython
I finally got my hands on my Neo Freerunner two weeks ago and have been playing with it when time allows (so much so that I haven't given myself time to blog about it).
Overall, the hardware is great. The first thing you notice is that the unit feels very solid and the quality of the display is excellent; bright and high resolution. I've had success with wifi, GPS, the SD card slot and basic GSM usage. I haven't had a chance to try out the accelerometers yet, mainly due to a (surprising) lack of software that uses them.
The Freerunner is a fantastic device but the software is still very much a work in progress. I wouldn't advise anyone to get one unless they're a developer or a Linux geek. This will change though as the software (rapidly) matures. Things get better each day as updates are released.
Perhaps best of all, Python is readily installed and there are Python APIs for accessing the hardware and GUI toolkits. Python also forms a crucial part of the in-development FreeSmartphone.Org (FSO) software stack which will one day be the standard base system for Openmoko phones. I'll be presenting a talk on Python and Openmoko at PyCon UK in September.
The Neo Freerunner is uber exciting and reeks of potential. I'm really looking forward to seeing where the community takes it and contributing where I can. Watch this space.
ps. SSHing to your phone is neat.
pps. Setting up your phone to dual boot between different images while waiting in the doctor's waiting room is really really neat :)
The OpenMoko Freerunner has been released! This is big news for people who'd like an open and free phone (running Linux) with some interesting hardware: GSM, GPRS data, WiFi, GPS, accelerometers, USB host support, accelerated graphics, SD card slot and much more. The software is still a work in progress so the phone is primarily for developers at this stage.
The UK distributor has been swamped by the number of people interested in buying one. This is certainly an encouraging sign for the potential success of the project. I'm on the list to get one in the next batch, fingers crossed.
At Resolver, we've been looking at better ways of dealing with unhandled exceptions that occur during test runs. Apart from the need to log that a problem occurred it is important that the dialog boxes that Windows generates don't block the test run (ideally they wouldn't appear at all). We had a hack in place to deal with these dialogs that I won't go into here. Let's just say we've been finding our hack inadequate.
In the .NET world there are two APIs that your program can use to be notified about unhandled exceptions. Each covers exceptions that happen in different parts of your code. In order to be comprehensive about catching unhandled exceptions you really need to use both APIs.
The first is the Application.ThreadException event. Despite what the name seems to indicate, this catches unhandled errors that occur during normal execution of Windows Forms applications. That is, in event handlers once Application.Run has been called.
Here's a quick example of how to use it from IronPython.
import clr
clr.AddReference('System')
clr.AddReference('System.Windows.Forms')

from System.Threading import ThreadExceptionEventHandler
from System.Windows.Forms import Application, UnhandledExceptionMode, Form, Button

def handler(sender, event):
    print 'Oh no!'
    print event.Exception
    Application.Exit()

Application.ThreadException += ThreadExceptionEventHandler(handler)
Application.SetUnhandledExceptionMode(UnhandledExceptionMode.CatchException)

def die(_, __):
    raise ValueError('foo')

form = Form()
button = Button(Text='Click me')
button.Click += die
form.Controls.Add(button)
Application.Run(form)
The second API is AppDomain.CurrentDomain.UnhandledException. This event is triggered when unhandled exceptions occur in threads other than the main thread. There is one caveat with using this event: Windows will still pop up its own dialog box even if you've installed your own handler! This is somewhat frustrating when you want to run unattended test runs as we do at Resolver. Our builds would block until someone comes along and closes the dialog.
It seems that many others face the same problem and there are no good solutions reported online. The only option I could find was to write some C++/C# that bypasses the .NET handler by using the SetUnhandledExceptionFilter Win32 call. The CLR uses the same underlying Win32 API to do its unhandled exception handling so by installing your own handler here you can prevent the .NET handler from firing and prevent the dialog box from appearing. The problem with this approach is that you probably don't have access to a useful CLR traceback.
This morning it occurred to me that perhaps if the process terminates during the unhandled exception handler then the CLR won't have the opportunity to show its own dialog. We had tried Application.Exit() without success (the dialog still appears) but terminating the current process with a Kill() did the trick! Here's how the code looks...
import clr
clr.AddReference('System')
clr.AddReference('System.Drawing')

from System import AppDomain, UnhandledExceptionEventHandler
from System.Diagnostics import Process
from System.Threading import Thread, ThreadStart

def handler(sender, event):
    print 'AppDomain error!'
    print event.ExceptionObject
    Process.GetCurrentProcess().Kill()

AppDomain.CurrentDomain.UnhandledException += UnhandledExceptionEventHandler(handler)

def die():
    raise ValueError('foo')

t = Thread(ThreadStart(die))
t.Start()
Thread.Sleep(5000)
It's ugly but it works. I'd love to know if there's a better way of dealing with this.
Hopefully this helps some people out there who have struggled with the same issue. Although these examples are in IronPython, the principles should easily translate to C# and other .NET languages.
As promised at the end of my talk, I just uploaded my slides and sample code from PyCon Italia. Included are the S5 slides, a simple Resolver One sample, the IronPython shell example and code for the demo program, mp3stats.
Thanks to Michael Foord for the basis of much of the slide content.
I'm enjoying the wonderful weather in Florence this weekend while attending PyCon Italy 2. Yesterday's highlight was Richard Stallman's thought-provoking keynote titled Free Software in ethics and in practice held in the jaw-dropping Palazzo Vecchio. Stallman's alter-ego Saint IGNUcious (of the church of Emacs) even made an appearance.
My presentation on Sunday covers Application Development in IronPython. It's mainly an introduction to IronPython for Python programmers. Being very much an Italian language conference, there's real-time translation of English presentations to Italian (mine included). Conversely, there's translation from Italian to English in one stream.
The conference organisers and attendees are being patient with my lack of Italian language skills. I feel very lucky that many Italians can speak English. It's easy to be complacent about learning other languages when you already know English.
I've been fortunate to have met some great people including Raymond Hettinger (core Python team), Arkadiusz Wahlig (Skype4Py) and some of the organisers: Simone, Giovanni and Lawrence. It's always great to be able to talk shop, exchange perspectives and be inspired. (Resolver One has been getting plugged too!) I'm looking forward to more conversations as the conference continues. The best stuff at conferences always happens outside of the lecture theatres.
ps. The food rocks! The conference lunch today was amazing and last night's Florentine style steak was super-tasty.
As mentioned by several others, Resolver One 1.0 was released on Wednesday!
For those of you who don't know, Resolver One is a unique spreadsheet application written in IronPython that allows you to easily add functionality using the Python language. The entire spreadsheet is represented as an IronPython program which can be eyeballed and extended. Changes to the spreadsheet grid are reflected in the Python code and changes to the code are reflected on the spreadsheet grid. It's really neat. Resolver One is great for people who want to develop complex spreadsheets in a clean, powerful and testable way. It's also useful (and fun!) for programmers when prototyping.
One of the Resolver developers, Andrzej, has written a nice article describing 5 Reasons To Try Resolver One. Of course there are more than just 5 reasons :) If you do try out Resolver One, make sure to check out the Resolver Hacks site too.
The 1.0 release of Resolver One is the result of over 2 years of work. Having joined the team very recently, I've only been part of the very last bit of that. One thing I've found really interesting is the time leading up to the release date. Compared to other commercial software projects I've worked on, the atmosphere felt under control and (almost) relaxed. It seems that the XP development practices used by the development team are really paying off. No horrible integration issues right before the release, no unexpected bugs on release day, no huge schedule blow-outs. Such a nice difference.
Susanna and I spent the Christmas and New Year period in New York City staying with the wonderful Libby and Phillip. Libby is a long time friend of Susanna's from New Zealand who's been studying in NYC for many years now.
On one of our first mornings in NYC we were walking out the door of Good Enough to Eat after a delicious brunch when I heard "Menno?". By freak coincidence Seth Vidal and his partner in crime Eunice were walking in the door at the same time as we were leaving. Seth and I used to work together a lot on the Yum project. I don't live in NYC and neither does he, but somehow we ended up at the same place in a huge city at the same time. A super weird but pleasant surprise. Seth describes the incident on his blog.
The remainder of our visit didn't offer any more strange coincidences but we had excellent fun. I caught up with Rohan, Susan and Jon one afternoon which was awesome. Thanks Rohan for showing us some sights (McSorley's is a must-do experience).
As always, photos to come...
I've just released IMAPClient 0.4. Changes are:
Many thanks to Helder Guerreiro for contributing patches for some of the features in this release.
IMAPClient can be installed from PyPI using easy_install IMAPClient
or
downloaded from my Code page. As
always, feedback and patches are most welcome
I released version 0.2 of the spamquestion plugin last night. It now conforms to the requirements for a PyBlosxom plugin. The code is much cleaner too. Available from the usual place.
Well, two comment spams have made it past the spamquestion plugin. This makes me wonder if either the submissions were done manually or whether the software the spammers use is at least human assisted. I guess it's also possible that the spam software is so good that it can automatically work out my simpler arithmetic questions.
The web server logs give some clues. There are literally hundreds of obviously automated POST attempts to various pages on my blog. The requests related to the two comments that made it through, however, seem far more human. Here's one example:
68.187.226.250 freshfoo.com - [03/Nov/2007:01:41:57 +0000] "GET /blog/Holland_photos_online.1024px HTTP/1.1" 200 11367 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
68.187.226.250 freshfoo.com - [03/Nov/2007:01:42:06 +0000] "POST /blog/Holland_photos_online#comment_anchor HTTP/1.1" 200 14928 "http://freshfoo.com/blog/Holland_photos_online.1024px" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
These are the only two HTTP requests made for the first spam that made it through; no dumb, repeated automatic requests like some of the other attempts in the logs. Notice how the parent page was visited first and then 9 seconds later the POST was made. That's pretty quick for someone to fill out the form manually but it's possible, especially if the spam body was ready in the clipboard. If their system is partially automated then the short delay is even more plausible.
To test whether some spambots are actually capable of doing simple arithmetic by themselves, I've removed all the addition and subtraction questions from my spamquestion configuration and have added more questions that are harder to answer programmatically. If the spam continues, then I'm going to conclude that there's definitely some human assistance going on. If it stops, then it's more likely that the spambot software was actually able to solve some of my arithmetic questions itself.
Something else I need to look at is short-term blocking of spamming IPs. When examining my logs I found there had been almost 500 comment spam attempts for just today! I'd rather not be dealing with that bandwidth on my server. Dropping all packets from a spammer's IP for a few hours would slow them right down.
Fun fun fun...
Due to my recent comment spam issues I've created a new PyBlosxom plugin called spamquestion. It is similar to the existing Magic Word plugin but instead of using just the one question for any comment entry on the blog, it randomly selects a question from a larger set of configured questions. This makes it much harder for spammers to get past the comment form using automated software. Unlike CAPTCHA systems this scheme doesn't disadvantage visually impaired people or those on text based browsers.
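The core idea is simple enough to sketch in a few lines of Python. Note that this is just an illustration of the approach, not the plugin's actual configuration format or API, and the questions shown are made-up examples:

import random

QUESTIONS = {
    "What colour is a ripe banana?": "yellow",
    "What animal says 'moo'?": "cow",
    "Type the word 'question' backwards.": "noitseuq",
}

def pick_question():
    # A different question is chosen each time the comment form is rendered.
    question = random.choice(list(QUESTIONS))
    return question, QUESTIONS[question]

def answer_is_correct(question, submitted):
    expected = QUESTIONS.get(question, "")
    return submitted.strip().lower() == expected.lower()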
The spamquestion plugin can be downloaded from my Code page.
Comments are now re-enabled on my site, with spamquestion enabled. It'll be interesting to see how the scheme holds up. I also plan to install the Akismet plugin as a second line of defense.
My blog was hit by a comment spammer last week. Hundreds of entries were made, interestingly focussing on only a few articles (perhaps those with a higher Google ranking?). Running without a CAPTCHA system or similar was good while it lasted. Comments are now disabled until I get around to installing a CAPTCHA style plugin.
Lazy web: what anti comment-spam technologies do you find work well for you? Is CAPTCHA the best option we have?
I started using SpamAssassin for my personal email over a month ago. Having seen the complete ineffectiveness of some anti-spam systems I was fairly pessimistic about how effective it would be. Boy was I wrong. Without any tweaks to the default filtering config (except for ensuring that the latest rules are being used) it stops virtually all spam from hitting my mailbox, with zero false-positives so far. I get 20-40 spams a day and only 1 or 2 a month get through to my inbox.
My mail volume is comparatively low so I just set Procmail to invoke SpamAssassin for each inbound message. For higher volume situations something like SA's spamd should probably be used. Using Procmail has the nice benefit of being able to direct spam to a separate folder for later perusal and deletion.
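For reference, that kind of Procmail setup is only a few lines. This is a minimal sketch rather than my exact .procmailrc; the lock file and folder name are illustrative:

# Pipe each message through SpamAssassin so it can add its headers.
:0fw: spamassassin.lock
| spamassassin

# File anything SpamAssassin flags as spam into a separate folder for review.
:0:
* ^X-Spam-Status: Yes
spam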
A cron job is set to run sa-update every night to ensure the latest default checks are being used. This is important; spammers develop new tricks to bypass anti-spam systems all the time.
Currently I have all suspected spam going to a spam folder. However SA has been so successful that I'm thinking of getting Procmail to automatically delete higher scoring spam and send only the lower scoring spam to the spam folder. Depending on attitudes towards false-positives, some might just delete all emails that SA thinks are spam. Personally, I'd rather be a bit cautious. Losing real email scares me.
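If I do go down that path, the score-based split would look something like this. Again, just a sketch: the ten-star threshold (SpamAssassin adds one '*' to X-Spam-Level per point of score) is an arbitrary example, not a recommendation.

# Silently discard very high scoring spam.
:0
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*
/dev/null

# Keep lower scoring spam in the spam folder for occasional review.
:0:
* ^X-Spam-Status: Yes
spam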
It's so nice when something works beyond expectation.
I've just made a new release of IMAPClient (0.3). Changes are:
"BODY[HEADER.FIELDS (FROM)]"
). Thanks Brian!IMAPClient can be installed from PyPI using easy_install IMAPClient
or
downloaded from my Code page. As
always, feedback and patches are most welcome
FuzzyFinder is a useful Vim extension that I've discovered recently (nothing to do with Fuzzyman). It has proven to be a great productivity enhancer, especially when dealing with large codebases with many files.
FuzzyFinder provides a mechanism to search through files on disk or Vim buffers using fuzzy filename matching. When activated it interactively searches the current directory for files matching the name you entered. Matching is very loose, so if for example you enter "abc", you'll get a list of all files matching *a*b*c*. It sounds strange at first but it is very effective in practice.
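FuzzyFinder itself is a Vim script, but the matching idea is easy to illustrate in a few lines of Python (a toy demonstration of the concept, not FuzzyFinder's implementation):

import re

def fuzzy_match(pattern, names):
    # "abc" becomes the regex a.*b.*c: the characters must appear in order,
    # with anything in between.
    regex = re.compile(".*".join(re.escape(c) for c in pattern))
    return [name for name in names if regex.search(name)]

print(fuzzy_match("abc", ["abc.txt", "alpha_beta.c", "main.rs"]))
# ['abc.txt', 'alpha_beta.c']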
Here's a screen shot of FuzzyFinder when first activated. A list of all files in the current directory is displayed.
The arrow keys can be used to make a selection from the list (useful if you can see what you want). If the list is long, start filtering!
This screenshot shows what happens after a few characters have been entered. The list of available choices is filtered to match. Very powerful.
FuzzyFinder can also do recursive matching using the ** wildcard. This is great for large source code trees.