Julia, the programming language

The hippest way to get your IEEE754 on. Hngh.

March 31, 2015 — July 19, 2022

computers are awful
dataviz
julia
number crunching
premature optimization

Assumed audience:

From level 0 (julia-curious) up to level 2 (how do you overload broadcasting?)

Julia: A JIT-compiled language with emphasis on the affordances for high-performance scientific computation. Which is to say, it is made for people who want to create new algorithms.

I use Julia enough that I have made many notes about it, splitting off new notebooks covering installation, processing arrays, tensors, matrices, debugging, profiling and accelerating, IDEs and workflows, APIs, FFIs and IO, autodiff, plotting, machine learning, and UIs.

1 Why Julia

tl;dr Not a magic bullet, a handy arrow for the quiver.

Some of Julia’s community makes ambitious claims about Julia being the fastest and bestest thing ever.

Unsurprisingly, Julia is no panacea. It is well designed for numerical computation. In my non-rigorous experiments, it seems to do better than other scripting+compilation options, such as cython, on certain tasks. In particular, if you have dynamically defined code in the inner loop, it does well — say, you are doing a Monte Carlo simulation but don’t know the user’s desired density ahead of time. This is more or less what you’d expect from doing compilation as late as possible rather than shipping a compiled library, but it (at least for my use case) involves less messing around with compilation tool-chains, platform libraries, ABIs, makefiles, etc.

Figure 1: Julia on the Pareto-optimal frontier (Nazarathy and Klok 2021)

Julia has its own idiosyncratic frictions. (I would like to supplement that previous image with a learning curve.) The community process can be problematic (see also giving up on Julia). Library support is patchy, with less mindshare than Python. It doesn’t run on iOS. It also uses lots of memory, so it is maybe not ideal for embedded controllers. Although my colleague Rowan assures me he runs serious Julia code on a Raspberry Pi all the time, so maybe I just need to tighten up my algorithms.

That said, the idea of a science-users-first JIT language is timely, and Julia is that. Python, for example, has clunky legacy issues in its numeric code and a patchy API, and is ill-designed for JIT compilation, despite various projects that attempt to facilitate it (although jax is probably as good as it can be). Matlab is expensive and nasty for non-numerics, not to mention old and crufty, and code seems to break between MATLAB versions at least as often as it does between Julia versions. Lua has a few good science libraries and could likely have filled a similar niche, but for reasons that are perhaps sociological as much as technical it doesn’t have the hipness or critical mass of Julia. Super hipsters think that Julia is not radical enough and like DEX instead, which is to Julia as Julia is to everything else.

2 Documentation

2.1 Intros

In order of increasing depth

Here’s something that wasn’t obvious to me: What are Symbols?

And here is a neat heuristic:

A Mental Model for Julia: Talking to a Scientist

  • When you’re talking, everything looks general. However, you really mean very specific details determined by context.
  • You can quickly dig deep into a subject, assuming many rules, theories, and terminology.
  • Nothing is hidden: if you ever want to hear about every little detail, you can ask.
  • They will get mad (and throw errors at you) if you begin to be loose with the specific details.

—Chris Rackauckas, Intro to Julia.

2.2 Staying current

Julia is a fast-moving target. One way to catch the hype in the confusingly rapid evolution of the package ecosystem is to check out the package hotness on Julia Observer.

3 Typing and dispatch

3.1 Keyword arguments

Keyword arguments exist but do not participate in method dispatch; they are second-class citizens and might make things slow or stupid if you need to specialize your code based on them. So… design your functions around that. This is usually OK with careful thought, but small lapses lead to many irritating helper functions to handle default arguments. There is much idiomatic style to learn for this.
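
For example (a minimal sketch; all the function names here are made up), dispatch only sees positional arguments, so the usual workaround is to forward a keyword to a positional helper:

# Methods specialise on positional arguments only:
f(x::Int; verbose=false) = verbose ? "loud Int" : "quiet Int"
f(x::Float64; verbose=false) = verbose ? "loud Float" : "quiet Float"

f(1)                  # "quiet Int"  — dispatched on the Int
f(1.0; verbose=true)  # "loud Float" — the keyword did not affect dispatch

# To specialise on a "keyword", forward it to a positional argument:
g(x; mode=:fast) = _g(x, Val(mode))
_g(x, ::Val{:fast}) = x + 1      # cheap path
_g(x, ::Val{:slow}) = x + 1.0    # expensive path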

3.2 Traits and dispatch

Dispatching is not always obvious.

I tend to forget an important keyword: Value Types, which are what allows one to choose a method based on the value, rather than the type, of a thing. Their usage was not obvious (to me) from the manual, but it is explained beautifully by Tim Holy. There are dangers.
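
A minimal sketch of the idea, with a hypothetical window-function selector (the Hann formula is standard; everything else is made up):

# Select a method by *value* by wrapping the value in Val
window(::Val{:rect}, n) = ones(n)
window(::Val{:hann}, n) = [0.5 - 0.5cos(2π * k / (n - 1)) for k in 0:n-1]
# Convenience wrapper so callers can pass a plain Symbol
window(name::Symbol, n) = window(Val(name), n)

window(:hann, 8)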

Chris Rackauckas explains Holy Traits, which are a Julia-idiomatic duck-typing method, explained in greater depth by Mauro3 as part of his discontinued Traits.jl library. See also Lyndon White.
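
Roughly, the trick is to dispatch on a small trait type computed from the argument’s type. A minimal sketch (all names here are my own, hypothetical):

abstract type IterationSpeed end
struct Fast <: IterationSpeed end
struct Slow <: IterationSpeed end

# Declare the trait: dense arrays are "fast", everything else "slow"
speed(::Type{<:Array}) = Fast()
speed(::Type) = Slow()

# The public function dispatches on the trait, not the type hierarchy
sumsq(x::T) where {T} = sumsq(speed(T), x)
sumsq(::Fast, x) = sum(abs2, x)              # tight array path
sumsq(::Slow, x) = sum(abs2(v) for v in x)   # generic fallback

sumsq([1.0, 2.0, 3.0])   # takes the Fast path
sumsq((1, 2, 3))         # takes the Slow path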

🏗

4 Pretty printing, formatting

4.1 Strings: care and feeding

There are many formatting libraries, because everyone seems to dislike the built-in option, which lacks niceties such as sensible user-defined formatting of floats.

Many alternatives seem to be based on Formatting.jl, which apes Python string formatting; this is IMO a reasonable baseline.

There is a friendly-ish fork, Format.jl, which has a slightly different philosophy and provides an alternative string syntax, StringLiterals.

Mustache.jl also gained traction, being a generic templating syntax; it is easy to roll your own formatting from this.

4.2 Rich display

Julia has an elaborate display system for types, as illustrated by Simon Danisch and Henry Shurkus.

tl;dr To display an object, invoke

display(instance)

Say you defined MyType. To ensure MyType displays sanely, define

Base.show(io::IO, instance::MyType)

e.g.

function Base.show(io::IO, inst::MyType)
    # assuming MyType wraps its data in a field called `vals` (hypothetical);
    # interpolating `inst` itself here would recurse back into this method
    print(io, "MyType(length=$(length(inst.vals)))")
end

Latexify (manual) marks up certain Julia objects nicely for markdown or TeX display.

latexify("x/y") |> print
$\frac{x}{y}$

PrettyTables aims to output ASCII tables, and happens to support LaTeX, HTML and various Markdown flavours.

TexTables.jl is a specialised table renderer for scientific table output with an easy interface. It seems a little less active/popular.

More specialised, RegressionTables does statistical tables with baked-in analysis and a side-order of formatting. That feels too tightly coupled to me. Its launch announcement is a tour of the statistical ecosystem.

Matti Pastell’s useful reproducible document rendering system, Weave.jl, supports basic table display for LaTeX/HTML, although they recommend handballing it to Latexify.

Note that automatically generating table output is still more tedious than you’d like; for example, we should often be using scientific notation, which generally requires writing custom formatters. Leandro Martinez walks us through that, or one can use ft_latex_sn, the built-in PrettyTables helper, as in the sketch below.
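
A hedged sketch of that helper (keyword and formatter names assume a recent PrettyTables release; the data is arbitrary):

using PrettyTables

data = [1.0e-8  2.5e3;
        3.14159 6.02e23]

# LaTeX backend with 3-significant-figure scientific notation
pretty_table(data; backend = Val(:latex), formatters = ft_latex_sn(3))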

5 Approximating and interpolating functions

ApproxFun.jl does Chebyshev and Fourier approximations of given functions. This is not, at least primarily, a tool for data analysis, but for solving eigenfunction problems and suchlike using computational Hilbert-space methods for functions which are otherwise difficult. In particular, we must be able to evaluate the target function at arbitrary points to construct the interpolant, rather than at, say, provided sample points. The useful companion package FastTransforms converts between different basis-function representations.

Interpolations.jl does arbitrary-order spline interpolations of mathematical functions, but also of data. This enables some clever tricks, e.g. approximate random sampling of tricky distributions.
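
For a taste of ApproxFun, here is a minimal sketch (the target function is arbitrary):

using ApproxFun

# Build a Chebyshev approximant to a function on an interval…
f = Fun(x -> sin(x^2) + cos(x), 0..4)

# …then treat it as a first-class function: differentiate and find roots
df = f'          # derivative of the approximant
roots(df)        # critical points of f on [0, 4]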

6 Units

People sometimes assert that Julia can handle physical units in a syntactically pleasant fashion, but rarely go on to show any evidence. This capability sounds useful to me, but it is not clear how to access it; the documentation assumes I already know. From what I can tell, I first need to install Unitful, which includes some useful units and, in particular, Quantity types.

Then I can use units in the following fashion:

julia> using Unitful
julia> const s = u"s"
julia> const minute = u"minute"
julia> 5.3s + 2.0minute
125.3 s

That is pleasant enough, I suppose?

To know how this works, and also how I can invent my own units, I read Erik Engheim or Lyndon White who explain it in depth.

See SampledSignals for a concrete example of how to do things with units, such as the following method definitions to handle time-to-index conversion.

toindex(buf::SampleBuf, t::Number) = t
toindex(buf::SampleBuf, t::Unitful.Time) = inframes(Int, t, samplerate(buf)) + 1
toindex(buf::SampleBuf, t::Quantity) = throw(Unitful.DimensionError(t, s))

7 Modules and imports

It was not obvious to me, when creating a package that contains submodules, how to import names from the root module into a submodule.

Suppose the package is called MyPackage, and MySubmodule is defined inside it. From within MySubmodule, you can reach back up to the root with a relative import:

using ..MyPackage: myfunction
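
In context, a minimal sketch of that layout (the module and function names are hypothetical):

module MyPackage

myfunction(x) = 2x

module MySubmodule
    # reach back up to the parent module with a relative path
    using ..MyPackage: myfunction

    doubled(x) = myfunction(x)
end

end # module MyPackage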

8 Useful macros

jkrumbiegel/Chain.jl: Even more convenient than pipes.
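
A quick hedged sketch of what that buys you: each line receives the previous result as its first argument unless an underscore placeholder says otherwise.

using Chain

@chain 1:10 begin
    filter(iseven, _)   # explicit placeholder
    sum                 # implicit: sum(previous result)
    sqrt
end                     # ≈ 5.477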

9 Destructuring assignment

a.k.a. slurping and splatting. Built in. Blingin’ options are provided by the Destructure.jl macro package.
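
The built-in forms, as a minimal sketch (the slurping and property-destructuring forms need newer Julia versions):

a, b = (1, 2)                   # tuple unpacking
x, rest... = [1, 2, 3, 4]       # slurp the tail into `rest`
(; μ, σ) = (μ = 0.0, σ = 1.0)   # destructure a named tuple by property name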

10 Writing macros and reflecting

10.1 Intermediate representation

I constantly search for this, so how ’bout I link it? The manual pages I need are Reflection and Metaprogramming. The latter has the parsed and interpolated representation documentation as used in macros. For inspecting my code to see what the language made of it I use @code_lowered, @code_warntype, @code_llvm depending on how much I wish to trade comprehensibility for precision.
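
For example, a minimal sketch of poking at a deliberately type-unstable function:

relu(x) = x > 0 ? x : 0      # returns the Int 0 or a Float64 — type-unstable for floats

@code_lowered relu(1.0)      # the desugared AST
@code_warntype relu(1.0)     # flags the Union{Float64, Int64} return type
@code_llvm relu(1.0)         # the generated LLVM IR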

10.2 What is this module?

Took me a while to work out that the current module is called @__MODULE__.

So if I want to look up a function by symbol, for example,

function makewindow(name::Symbol, args...; kwargs...)
    winfunc = getfield(@__MODULE__, name)
    makewindow(winfunc, args...; kwargs...)
end

11 Gotchas, tips

Chris Rackauckas mentions 7 Julia gotchas.

Here are some more.

11.1 Implementing standard interfaces for custom types

The type system is logical, although it’s not obvious if you are used to classical OOP. (Not a criticism.)

You want to implement a standard interface on your type so you can, e.g. iterate over it, which commonly looks like this:

for element in iterable
    # body
end

or equivalently

iter_result = iterate(iterable)
while iter_result !== nothing
    (element, state) = iter_result
    # body
    iter_result = iterate(iterable, state)
end

Here is an example of that: a range iterator which yields every nth element up to some number of elements could look like this:

julia> struct EveryNth
           n::Int
           start::Int
           length::Int
       end
julia> function Base.iterate(iter::EveryNth, state=(iter.start, 0))
           element, count = state

           if count >= iter.length
               return nothing
           end

           return (element, (element + iter.n, count + 1))
       end
julia> Base.length(iter::EveryNth) = iter.length
julia> Base.eltype(iter::EveryNth) = Int

(If you are lucky you might be able to inherit from AbstractArray.)

It’s weird for me that this requires injecting your methods into another namespace, in this case Base. That might feel gross, and it leads to some surprising behaviour and mind-bending namespace-resolution rules for methods: importing one package can magically change the behaviour of another. This monkey-patch style is everywhere in Julia; extending another module’s functions for your own types is simply how things are done, whereas doing it for types you don’t own is the hazardous variant known in Julia argot as “type piracy”. Either way it is clearly marked when you write a package, but not when you use the package. Anyway, it works fine and I can’t imagine how to handle the multiple dispatch thing better, so deal with it.

11.2 Array slicing may copy

You are using a chunk of an existing array and don’t want to copy? Consider using views for slices, they say, which means not using slice notation but rather the view function, or the @views macro. Both of these are ugly in different ways, so I cross my fingers and hope the compiler can optimise away some of this nonsense.
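
A minimal sketch of the difference:

A = rand(1000, 1000)

col_copy = A[:, 1]          # slice notation allocates a fresh Vector
col_view = view(A, :, 1)    # a SubArray that shares A's memory
@views col_also = A[:, 1]   # the same view, via the macro

col_view[1] = 0.0           # mutates A; col_copy is unaffected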

11.3 Custom containers are scruffy

If you need container types, the idiomatic way to do this is using parametric types and parametric methods and so-called orthogonal design.

The rule is: let the compiler work out the argument types in function definitions, but you should choose the types in variable definitions (i.e. when you are calling said functions).

(Thanks to Chris Rackauckas for clarifying this point for me.)
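
A minimal sketch of what that means in practice (the container here is made up):

# Parametric type: the element type is a parameter, not hard-coded
struct Bucket{T}
    items::Vector{T}
end

# Parametric method: let the compiler infer T at the call site
total(b::Bucket{T}) where {T<:Number} = sum(b.items)

# …but be concrete when constructing:
b = Bucket{Float64}([1.0, 2.5])
total(b)    # 3.5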

11.4 Containers of containers need parameterisation

It’s hard to work out what went wrong when you hit the error Type definition: expected UnionAll, got TypeVar.

12 Parallel Julia

The ClusterManager system provides an OK abstraction for orchestrating a bunch of simultaneously active workers.
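
A minimal sketch using the default local cluster manager from the Distributed standard library (the workload here is arbitrary):

using Distributed

addprocs(4)    # four local worker processes via the default LocalManager

# Workers only see definitions made @everywhere
@everywhere heavy(x) = sum(sqrt.(rand(10_000)) .* x)

results = pmap(heavy, 1:8)   # farm the calls out to the worker pool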

13 References

Akbayrak, Bocharov, and de Vries. 2021. “Extended Variational Message Passing for Automated Approximate Bayesian Inference.” Entropy.
Boyd, and Vandenberghe. 2021. Julia Companion to Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares.
Cox, van de Laar, and de Vries. 2019. “A Factor Graph Approach to Automated Design of Bayesian Signal Processing Algorithms.” International Journal of Approximate Reasoning.
Cusumano-Towner, and Mansinghka. 2018. “A Design Proposal for Gen: Probabilistic Programming with Fast Custom Inference via Code Generation.” In Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages. MAPL 2018.
Fischer, and Saba. 2018. “Automatic Full Compilation of Julia Programs and ML Models to Cloud TPUs.” arXiv:1810.09868 [cs, stat].
Innes. 2018. “Don’t Unroll Adjoint: Differentiating SSA-Form Programs.” arXiv:1810.07951 [cs].
Lau, Drosos, Markel, et al. 2020. “The Design Space of Computational Notebooks: An Analysis of 60 Systems in Academia and Industry.” In 2020 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC).
McNicholas, and Tait. 2019. Data Science with Julia.
Nazarathy, and Klok. 2021. Statistics with Julia: Fundamentals for Data Science, Machine Learning and Artificial Intelligence. Springer Series in the Data Sciences.
Pawar, and San. 2019. “CFD Julia: A Learning Module Structuring an Introductory Course on Computational Fluid Dynamics.” Fluids.
Rackauckas. 2019a. “Neural Jump SDEs (Jump Diffusions) and Neural PDEs.” The Winnower.
———. 2019b. “The Essential Tools of Scientific Machine Learning (Scientific ML).”
Reid. 2015. Advanced Analytic Methods in Science and Engineering (18.305).
Revels, Lubin, and Papamarkou. 2016. “Forward-Mode Automatic Differentiation in Julia.” arXiv:1607.07892 [cs].
van de Laar, Cox, Senoz, et al. 2018. “ForneyLab: A Toolbox for Biologically Plausible Free Energy Minimization in Dynamic Neural Models.” In Conference on Complex Systems.
Xu, Kailai, and Darve. 2020. “ADCME: Learning Spatially-Varying Physical Fields Using Deep Neural Networks.” arXiv:2011.11955 [cs, math].
Xu, Kai, Ge, Tebbutt, et al. 2019. “AdvancedHMC.jl: A Robust, Modular and Efficient Implementation of Advanced HMC Algorithms.”