I gave a talk to economists (i.e., not professional programmers) about a Julia project we were working on. One thing Julia is famous for is its speed. It uses LLVM [1] as its back-end. As seeing is believing, here is a small example. I chose it because it is small, easy to explain, and easy to program, while still showing meaningful differences in run time.
We have a square \([-1,+1]\times[-1,+1]\) and an inscribed circle with radius \(1\). Their areas are \(4\) and \(\pi\) respectively. The idea is to draw \(n\) independent points \[\begin{aligned}& x_i \sim U(-1,+1) \\ & y_i \sim U(-1,+1)\end{aligned}\]Let \(m\) be the number of points inside the circle, i.e. with \[x_i^2+y_i^2\lt 1\] Because the points are uniformly distributed over the square, the probability that a point lands inside the circle is the ratio of the two areas, so for large \(n\) \[\frac{m}{n} \approx \frac{\pi}{4}\] It follows that an estimate of \(\pi\) is \[\hat{\pi}=4\frac{m}{n}\]
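The estimator above can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the listing from the talk; the function name `estimate_pi` and the `seed` parameter are assumptions introduced here for reproducibility.

```python
import random


def estimate_pi(n, seed=None):
    """Estimate pi from n uniform points in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # seed is a hypothetical convenience parameter
    m = 0  # number of points that land strictly inside the unit circle
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y < 1.0:
            m += 1
    # pi_hat = 4 * m / n
    return 4.0 * m / n


if __name__ == "__main__":
    print(estimate_pi(100_000, seed=42))
```

The estimate varies from run to run unless a seed is fixed, and it gets closer to \(\pi\) as \(n\) grows.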
Here, I look at three different implementations of this simulation:
- A straightforward implementation in Python using a for loop. This is very slow.
- A Python implementation using NumPy [2]. Here, we use arrays instead of explicit loops. Of course, NumPy still loops behind the scenes, but those loops run inside its compiled C implementation. This formulation gives a big performance boost. Lesson: in Python, explicit loops are very expensive.
- The same intuitive loop-based implementation as the first one, but now in Julia. This is the fastest.
A single simulation uses 1 million points, and we repeat this 10 times.
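1. Python loop
2. Python/NumPy without loops
3. Julia loop. Note the use of \(\in\) and \(4m\): Julia accepts \(\in\) as a synonym for the keyword "in" in a for loop, and writes multiplication by a numeric literal as plain juxtaposition.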
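As a rough illustration of the array-based variant and of the timing setup (1 million points, repeated 10 times), here is a minimal NumPy sketch. It is not the listing from the talk: the names `estimate_pi_numpy` and `run_benchmark`, the fixed seed, and the use of `time.perf_counter` for timing are assumptions made for this example.

```python
import time

import numpy as np


def estimate_pi_numpy(n, rng):
    """Estimate pi with n points, using array operations instead of a loop."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = rng.uniform(-1.0, 1.0, size=n)
    # Boolean array marking points inside the circle; count the True entries.
    m = np.count_nonzero(x * x + y * y < 1.0)
    return 4.0 * m / n


def run_benchmark(n=1_000_000, repeats=10, seed=0):
    """Run the simulation `repeats` times; return the estimates and total seconds."""
    rng = np.random.default_rng(seed)  # seed chosen arbitrarily for this sketch
    start = time.perf_counter()
    estimates = [estimate_pi_numpy(n, rng) for _ in range(repeats)]
    elapsed = time.perf_counter() - start
    return estimates, elapsed


if __name__ == "__main__":
    estimates, elapsed = run_benchmark()
    print(f"mean estimate: {sum(estimates) / len(estimates):.5f}")
    print(f"total time: {elapsed:.3f} s")
```

The same `run_benchmark` pattern can wrap a loop-based function to compare the two Python variants on one machine; absolute timings depend on hardware and library versions.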
When programming in Python, it is not unusual to write two very different implementations: one that works and then one that is fast. An argument for using Julia could be that the first implementation is already very fast.
In Python, an alternative to using NumPy is to use Numba [3], a just-in-time compiler that, like Julia, uses LLVM as its back-end.
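A minimal sketch of what the Numba route might look like is given below. It keeps the explicit loop and only adds the `njit` decorator from Numba. The function name `estimate_pi_jit` is hypothetical, and the fallback decorator is an assumption added so the example still runs (as ordinary, slow Python) when Numba is not installed. No timings are claimed here.

```python
import random

try:
    from numba import njit  # third-party; compiles the decorated function via LLVM
except ImportError:
    # Assumption for this sketch: without Numba, fall back to plain Python.
    def njit(func):
        return func


@njit
def estimate_pi_jit(n):
    """Loop-based estimate of pi; compiled on first call if Numba is available."""
    m = 0  # number of points inside the unit circle
    for _ in range(n):
        # random.random() is uniform on [0, 1); rescale to [-1, 1)
        x = 2.0 * random.random() - 1.0
        y = 2.0 * random.random() - 1.0
        if x * x + y * y < 1.0:
            m += 1
    return 4.0 * m / n


if __name__ == "__main__":
    print(estimate_pi_jit(1_000_000))
```

With Numba, the first call includes compilation time, so a fair timing would call the function once before starting the clock.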
References
- [1] The LLVM Compiler Infrastructure, https://llvm.org/
- [2] NumPy, https://numpy.org/
- [3] Numba, https://numba.pydata.org/