General Programming In Julia Language From An Advanced Standpoint¶
Julia is a multi-paradigm language which is purely dynamic but supports optional typing.
As the introduction of some basic constructs like For-Loop
, While-Loop
and If-Else
and Function
akin to
those of Python, R or MATLAB, Julia becomes simple and concise, and allows you to work fluently with a fraction of its language
features.
Although People seldom focus on Julia’s language features other than its high performance(until April 2019), I’d show my point of view of the general programming in Julia.
First-Class¶
Almost all language constructs in Julia are first-class, upholding more flexible and advanced program composition.
There’re quite a lot of people that has a prejudice against Python for the absense of multi-line lambdas. Although it’s pretty trivial and straightforward to compile multi-line lambdas into normal Python code objects, the frontend, more concretely the mandatory indentation, comfines Python to the expressive power as a result of that statements are rigidly separate from expressions(statements cannot occur in expressions), and then first-class gets missed. There’re other manners to achieve indentation without lossing expression-first like that of ML family and Haskell, but not adopted by Python community yet.
Julia seems to be over-first-class, for all constructs there are expressions and, syntactically there’s no restriction to compose arbitrary constructs. Definitions and computations are distinct from each other in major functional languages, but each of both consists of the other.
[let y=f(x); (x, y) end for x in points]
map(
function map_fn(x)
# do stuffs
end,
points
)
let bind_maybe(a, f) -> a == nothing ? f(a) : nothing,
(|>) = bind_maybe
init |> a ->
do1_stuffs(a) |> a ->
... |> a ->
... |> a ->
...
end
The reason why composability is crucial has been discussed often years on, so I won’t make an another about this topic.
For Julia, first-class also helps when it comes to processing programs as data, which will be unfolded latter in this article.
Polymorphisms of Multiple Dispatch¶
Julia supports polymorphisms to achieve necessary abstractions and reduce self-repeating, though it does stand apart from those pooular industrial languages.
Multiple dispatch looks like parametric polymorphism, but actually it’s more than the latter.
function usum(x :: Vector{T}) where T
s = zero(T)
for each in x
s += each
end
s
end
Overloading is supported as well.
uzero(_ :: Type{Int}) = 0
uzero( :: Type{Float64}) = 0.0
However, multiple dispatch is more than what I listed above. In fact, type is exclusively a specialized instance of immutable data that could be taken advantage of dispatching, while you can make dispatches via immutable data no matter whether it is a type or others.
struct Const{T}
end
flip(::Type{Const{:even}}) = Const{:odd}
flip(::Type{Const{:odd}}) = Const{:even}
f(::Type{Const{0}}) = Const{:even}
f(::Type{Const{1}}) = Const{:odd}
f(::Type{Const{N}}) where N = f(Const{N-1}) |> flip
With above codes, we can statically compute parity of numbers, just as expected.
julia> @code_warntype f(Const{2})
Body::Type{Const{:even}}
1 ─ return Const{:even}
Note that when multiple dispatch fails at static inferences, it’ll behave as dynamic dispatch like Python’s.
Full-Featured Macros¶
Macro is one of the quite few manners to achieve code reuse, also the reason of why some programmers can be thousands of times more efficient than others.
julia> macro gen_var(n :: Int, f)
defs = [Expr(:(=), Symbol("var", i), :($f($i))) for i in 1:n]
esc(Expr(:block, defs..., nothing))
end
@gen_var (macro with 1 method)
julia> f(x) = x * 10 - 2
f (generic function with 1 method)
julia> @gen_var f (x * 10 - 2)
julia> var1
8
julia> var2
18
julia> var3
28
Macro, the Function from AST to AST¶
Once you know macros are functions from ASTs to ASTs, there’s no mystery in Julia macros.
macro f(x)
println(x)
:($x + 1)
end
@assert (@f 1) == 2
Above snippet shows a vivid example of Julia macros. Firstly macro
keyword leads a definition of
macro transformation rule, and @f
marks a callsite of corresonding macro.
You might ask why (@f 1) == 2
, for the return of macro @f
is supposed to be an AST, it seems
a bit magic that it equals to an integer 2
.
Pay attention to the expression @assert (@f 1) == 2
. As the macro invocations are processed recursively
from the inside out, we should firstly process @f 1
.
(function f(x)
println(x)
:($x + 1) => :(1 + 1)
end)(1)
Above step also writes stdio, when executing the AST to AST function f
, a.k.a macro @f
.
Next, as we has already got the output, an AST :(1 + 1)
, imagine that we displace @f 1
by it in the preceding codes,
which produces @assert $(:(1 + 1)) == 2
, simplify it, we’ll get @assert (1 + 1) == 2
.
You might ask why not @assert :(1 + 1) == 2
, good question, let’s dig into it.
Think that if what you return from a macro invocation is always a runtime AST, it will not be transformed into codes to compile, so that such a macro becomes useless at all.
However, if we “unquote” the macro return
Quoted | Unquoted |
---|---|
:(:(1 + 1)) |
:(1 + 1) |
:(1 + 1) |
1 + 1 |
quote 1 + 1 end |
1 + 1 |
quote $x + x end |
<x> + x , where <x> stands for some computated expression. |
1 |
1 |
[1, 2, 3] |
[1, 2, 3] |
Above table unveils the rules of AST interpolations, and obviously there’s a law that
if we say an expression is quoted N
times, it’ll be interpolated as an expression
quoted max(0, N - 1)
times.
Scope and Hygiene¶
The scoping rules of macros are simple enough when you are under the point of view that macros are functions from ASTs to ASTs.
julia> module A
var = 0
macro ma()
quote
var
end
end
end
julia> var = 5555
julia> A.@ma
0
julia> using .A: @ma
julia> @ma
0
The first I’d present here is, the expression a macro return is evaluated by the module where the macro’s defined.
When a macro is expanding inside the local scope of a function, a concept called hygiene comes up naturally.
macro assign_y(x)
:(y = $x)
end
function f(x)
@assign_y x
y
end
f(1)
You might expect it works, but unfortunately it won’t, and solely feed you with
ERROR: UndefVarError: x not defined
Stacktrace:
[1] f(::Int64) at ./REPL[6]:2
[2] top-level scope at none:0
The reason why for this is, AST interpolations will be always preprocessed to make sure all
bare symbols(not boxed in QuoteNode or deeper quotation) are transformed into mangled names(a.k.a, gensym) that looks
a bit weird like ##a#168
. Also, the reason why Julia does this is, to by default avoid generate new symbols visible
in current local context.
Just think about you want a macro to log the value just calculated:
macro with_logging(expr)
quote
a = $expr
@info :logging a
a
end
end
We didn’t transform the symbol a
into something like ##a#167
, what if
you had already define a
in your codes?
function x5(a) # take it as multiply 5
x2 = @with_logging 2a
x3 = @with_logging 3a
x2 + x3
end
my_func(1)
You can see that if macro with_logging
didn’t transform a
written in macro body,
you’ll get x5(1) == 8
instead of x5(1) == 5
.
That’s it, and we call this sort of macro the Hygienic Macros.
But there does have some context-sensitive cases for code generation, that you want to share the context of multiple generator functions. A impressive example is my MLStyle, which I’m extremely proud of for it has reached a high performant pattern matching compilation with the extensibility(customizable) I’ve dreamed about since I started programming. In this package, when compiling patterns, I’d use manual gensym instead of depend on hygienic macros, for I might access specific symbols from separate functions that’ll generate some codes.
In this case that people want to generate symbols that will contaminate scopes, Julia provides an escape mechanism to avoid gensym.
macro assign_y(x)
esc(:(y = $x))
end
function f(x)
@assign_y x
y
end
f(1) # 1
The previous code finally works after supplementing a esc
invocation on returned AST.
Other Useful Knowledge for Julia Macros¶
@__MODULE__
gets you current module.- When you want to control which module to evaluate a given AST, you can use
moduleX.eval(expr)
or@eval moduleX expr
. - Although we already know macros are functions, something need to be stressed is,
there’re 2 implicit arguments of a macro:
__module__
and__source__
.__module
is the module you invoke the macro in,__source__
is the line number node that denotes the number of the line you invoke the macro.
A Big Step Forward in AST Manipulations¶
Julia does a lot on ASTs, e.g., analysis, substitution, rewriting, and so on.
As we’ve introduced the laws of AST interpolations, you might know that we can generate ASTs like following codes instead of in purely constructive manner.
ex0 = [1, 2, 3]
ex1 = :[1, 2, 3]
ex2 = :($ex0 + 1)
# :([1, 2, 3] + 1), [1, 2, 3] here is already evaluated before interpolation.
ex3 = :($ex1 + 1)
# :([1, 2, 3] + 1)
That’s not that cool for there’re also other approximate supports in other languages in current stage.
The advance is at the deconstructing of ASTs, which extremely impacts the way we could analyse ASTs.
Think about a case that you’d like to collect positional arguments and keyword arguments from some function callsites.
get_arg_info(:(f(a, b, c = 1; b = 2))) # => ([:a, :b, :(c = 1)], [:(b = 2)])
get_arg_info(:(f(args...; kwargs...))) # => ([:(args...)], [:(kwargs...)])
get_arg_info(:(f(a, b, c))) # => ([:a, :b, :c], [])
How will you achieve this task?
Attention! No matter how you’ll deal with it, think about whether you need to
get a prerequisite about Julia AST structures? Say, you have to know Expr
(a.k.a
one of the most important Julia AST types) has 2 fields, head
and args
,
or you have to understand the structure of a.b
is
Expr
head: Symbol .
args: Array{Any}((2,))
1: Symbol a
2: QuoteNode
value: Symbol b
instead of
Expr
head: Symbol .
args: Array{Any}((2,))
1: Symbol a
2: Symbol b
, or you have to make it clear that in vector literals, there’re
Julia code | AST structure |
---|---|
[1 2 3] |
Expr
head: Symbol hcat
args: Array{Any}((3,))
1: Int64 1
2: Int64 2
3: Int64 3
|
[1, 2, 3] |
Expr
head: Symbol vect
args: Array{Any}((3,))
1: Int64 1
2: Int64 2
3: Int64 3
|
[1; 2; 3] |
Expr
head: Symbol vcat
args: Array{Any}((3,))
1: Int64 1
2: Int64 2
3: Int64 3
|
[1 2; 3 4] or [1 2
3 4]
|
Expr
head: Symbol vcat
args: Array{Any}((2,))
1: Expr
head: Symbol row
args: Array{Any}((2,))
1: Int64 1
2: Int64 2
2: Expr
head: Symbol row
args: Array{Any}((2,))
1: Int64 3
2: Int64 4
|
To be honest, there’re so many detailed rules about the strcutrue, but is it really necessary to know them all if you’re planning to work with Julia ASTs(including analysis about their strcutrues)?
No! Absolutely no! Although I know many of you older Julia guys are always checking and decomposing ASTs with your vast knowledge about Julia internals , I’d suggest you sincerely to start using MLStyle.jl, for the sake of your efficiency.
At here, I’d introduce MLStyle’s AST Manipulations to you via giving some impressive examples. For sure this package will be displaced by some better alternative one day, but the underlying methodology wouldn’t change at all. Don’t open the link immediately now, just follow my introduction now, you don’t need to care much about a specific package or framework, but its core idea.
A tremendous inspiration occurred to me on one day in the last year(2018) that, what if we can deconstruct ASTs just as how they’re constructed.
In fact, You don’t have to know accurately about all AST structures when you start using corresonding syntaxes, like you just write
a = [1, 2, 3]
b = [1 2 3]
c = [1; 2; 3]
d = [1 2; 3 4]
Does you have to know what will the their AST look like? No.
A classmate of mine who knows only mathematics and has never got an experience in programming before can still write such codes fluently to complish his linear algebra homeworks, but he does feel annoyed when I try to explain the concepts of ASTs and how the ASTs he just written would look like.
You might have notice the importance of using syntactic components, yes, it’ll simply makes progress in the history we manipulate programs as data.
Pattern matching is an essential infrastructure in modern functional languages, which reduces the complexity of almost all logics via deconstructing data as how data is constructed.
Okay, this sentence occurred twice now:
deconstructing data as how data is constructed.
Remember it, and it’s our principle in this section.
Let’s think about how ASTs are constructed?
Firstly, we can write raw ASTs, write them literally.
ex = :(a + 1)
ex = :[1 2 3]
Next, there are syntactic AST interpolations.
ex = :[1, 2, 3]
ex = :($ex + 1) # :([1, 2, 3] + 1)
That’s enough. Now, let’s introduce a @match
. This syntax may be
deprecated in the better alternative in your time, but you must be able to simply
make an equivalence or superior via the better one with the new start-of-the-art pattern matching package
at the time you’re reading this article.
@match value begin
pattern1 => value1
pattern2 => value2
end
To support match literal ASTs, we must get a true
with following codes,
# rmlines is a function that removes all line number nodes in an AST.
# you can fetch it here : https://github.com/thautwarm/MLStyle.jl#preview
@match rmlines(:(let x = 1; x end)) begin
:(let x = 1; x end) => true
_ => false
end
Think about the principle we’ve figured out, okay, I’d stress it again here as I’m a shabby repeater:
deconstructing data as how data is constructed.
Then
v = :[1, 2, 3]
ex = :($v + 1)
@match ex begin
:($v + 1) => v == :[1, 2, 3]
_ => false
end
Oooh! Do you understand it? Does it make sense in your opinion?
AST interpolation corresponds to constructing, while AST interpolations occur in pattern it’s regarded as deconstructing. We can call it “AST extractions”.
Now, let’s turn back to the original question, to implement get_arg_info
mentioned previously.
We should at first introduce some examples about constructing in the case of get_arg_info
.
If we want to pass arguments to f(a, b; c, d)
, we can use
f(a, b; c, d) = a + b + c + d
args = [1, 2]
kwargs = Dict(:c => 1, :d => 2)
f(args... ; kwargs...)
, which produces a result 6
.
Notice about the form f(args...; kwargs...)
, it might indicates that
in AST level, positional arguments and keyword arguments are stored in
arrays, respectively.
Let’s have a try:
args = [:a, :b]
:(f($(args...)))
And you’ll get an output exactly as
:(f(a, b))
Good job, now we use MLStyle’s @match
, following the rule
deconstructing data as how data is constructed as well.
args = [:a, :b]
@match :(f($(args...))) begin
:(f($(args...))) => args == [:a, :b]
_ => false
end
Then you get a true
as output.
Think a while, and check the final implementation of get_arg_info
:
get_arg_info(ex) = @match ex begin
:($name($(args...); $(kwargs...))) ||
:($name($(args...))) && Do(kwargs = []) => (args, kwargs)
_ => throw("invalid input")
end
||
denotes the so-called Or-Pattern.
Limitation: Absence of Function Types¶
Julia is an ideal language for quite many domains but, not for all.
For people who’re used to functional programming languages, especially for the groups that tilts the advanced type-based polymorphisms(type classes’ instance resolution, implicit type variables, higher-kinded-polymorphisms, etc.), there’s an essential necessity of the dedicated function type.
On and off, I’ve attempted a lot with my friends to emulate those advanced type-based polymorphisms in Julia, but finally we noticed that without implicit inferences on functions, only dynamic typing and multiple dispatch are far from being sufficient.
In Julia, each function has its own type which is a subtype of Function
, which prevents
making abstractions for functions from common behaviours in type level. The worse is, these
abstractions on functions in type level have been proven pervasive and fairly useful by academic
world for about 10000 year, and perform a role like arithmetic operation in our educations.
In Haskell, the type signature of a function does help in semantics side. Following Haskell code allows users to automatically generate tests for a given type/domain by taking advantage of properties/traits of the type/domain.
import Control.Arrow
import Data.Kind
newtype MkTest (c :: * -> Constraint) a = MkTest {runTest :: a}
class TestCase (c :: * -> Constraint) a where
samples :: c a => MkTest c [a]
testWith :: c a => (a -> Bool) -> MkTest c [(a, Bool)]
testWith logic =
MkTest $ map (id &&& logic) seq
where
seq :: [a]
seq = runTest (samples :: MkTest c [a])
type TestOn c a = c a => (a -> Bool) -> MkTest c [(a, Bool)]
Now I’m to illustrate how Haskell achieves a perfectly extensible and reasonable automatic test generator, through following instances(terminology in Haskell, other than a synonym of “example”), using function types to achieve polymorphisms that absolutely Julia cannot make so far(Juliav1.1).
instance TestCase Enum a where
samples = MkTest . enumFrom . toEnum $ 0
instance TestCase Bounded a where
samples = MkTest [maxBound, minBound]
We has now made instances for TestCase
on the constraints(a.k.a Type Classes) Enum
and Bounded
.
For the readers who’re not that familiar to Haskell, you could take constraints in Haskell as traits(in Rust) or loose-coupled interfaces.
Once a type is under constraint Enum
, you can enumerate its values, plus
Bounded
is a constraint capable of making sure that the maximum and minimum are
available(via maxBound
and minBound
). instance TestCase Enum a
denotes for all concrete type a
, make the constraint TestCase on
constraint Enum
and type a
. Yes, TestCase
is also a constraint,
a constraint among other constraints and types.
So far our test generator has been already finished, a bit too fast, right? That’s how Haskell matters: pragmatic, productive.
We can then make tests with above codes, taking advantage of properties/traits of our data types:
onEnumerable :: TestOn Enum a
onEnumerable logic = testWith logic
intTest :: Int -> Int
intTest x = x ^ 2 + 4 * x + 4 == (x - 2)^2
boolTest :: Bool -> Bool
boolTest x = True
main = do
putStrLn . show . take 10 . runTest $ onEnumerable intTest
putStrLn . show . runTest $ onEnumerable boolTest
return ()
which outputs
[(0,True),(1,True),(2,True),(3,True),(4,True),(5,True),(6,True),(7,True),(8,True),(9,True)]
[(False, True), (True, True)]
Take care that the I did nothing to generate test sets, but passing off the function intTest
and boolTest to specify the test logics. I solely said that I want to test data
types on their enumerable traits(onEnumerable
), then the polymorphic automatic test generator
took advantage of functions’ type signatures(intTest: Int -> Bool, boolTest: Bool -> Bool), and
all tasks finished.
Turn back to Julia side, although Haskell does a lot implcits, multiple dispatch can often emulate them successfully(without strongly typed and static checking though). The problem is at the absense of function types, as we cannot take advantage of their type information to engage dispatching.
Some tentative but incomplete workaround could be made through following idea:
import Base: convert
struct Fn{Arg, Ret, JlFuncType}
f :: JlFuncType
end
@inline Fn{Arg, Ret}(f :: JlFuncType) where {Arg, Ret, JlFuncType} = Fn{Arg, Ret, JlFuncType}(f)
@generated function (f :: Fn{Arg, Ret, JlFuncType})(a :: Arg) :: Ret where {Arg, Ret, JlFuncType}
quote
$(Expr(:meta, :inline))
f.f(a)
end
end
convert(Fn{Arg, Ret, JlFuncType}, f :: JlFuncType) where {Arg, Ret, JlFuncType} = Fn{Arg, Ret, JlFuncType}(f)
This is a considerably efficient function type implementation according to FunctionWrappers.jl, However, the problem is that its usage is quite unfriendly for peope have to manually annotate disturbingly much.
To address the polymorphism problems, the major methods(type-based) from current academic world won’t work in Julia, and you should pave the way for a LISP-flavored “polymorphism”, in other words, use macros frequently.