ArticleML
Zach Flynn
2024-02-24

Table of Contents

  1. Introduction
  2. Writing an Article
  3. Storing Cites in a Bibliography
  4. Writing Math
  5. Custom Styling
  6. Math Vocabulary

Introduction

ArticleML is a markup language for writing... articles. It lets you weave math, text, and tables together. It includes a citation management system and an easy way to add references that link to other parts of the document.

Project goal:

ArticleML aims to make it easy to move research off printer-focused PDF and onto web-focused hypertext.

The basic principles of the project:

ArticleML is a type of XML. For our purposes, XML is a format that's like HTML except you always have to close the tag or write things like <br/> (a "self-closing" tag). There are a lot of other features of XML, but we won't need them.

Writing an Article

This section describes how to write an article in ArticleML. Most of the article is written the same as you would write any article in HTML, but it has some special features to make the researcher's life easier.

ArticleML documents start with the <article> tag. This is the root tag that contains the entire document.

The first tag that will usually appear below that is the <meta> tag. This section defines information about the document. Currently, four tags are possible underneath <meta>: <title>, <author>, <date>, and <institution>.


      
               <meta>
               <title>ArticleML</title>
               <author>Zach Flynn</author>
               <date>today</date>
               <institution>Fill in your institution</institution>
               </meta>
    
    

These tags make it easy for another program to find out who wrote an article, and also determine the heading of the article.

If your document has an abstract, you can add it here with the <abstract> tag. You can use any HTML tags you'd like to format text in the abstract. The abstract is treated the same as any other section of the document, except that it is formatted differently and is always put at the top of the document no matter where it is declared in the ArticleML file.


      
               <abstract>
               <p>This article is about <b>important</b> topics like how to solve <m>a x^[2] + bx + c = 0</m>.</p>
               </abstract>
    
    

The rest of the article is divided into <section> blocks, like this:


      
               <section name="Important Section">
               <p>Stuff....</p>
               </section>
      
    
    

Each section can have a name and a title. If only the name is specified, then it is used as the title. If neither is present, then nothing will indicate it is a new section in the HTML output.
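
Putting the pieces above together, a minimal complete document might look like the sketch below. It uses only the tags described so far. The names and text are placeholders, and writing title as an attribute next to name is an assumption, since only the name attribute appears in the examples above.

```xml
<article>
  <meta>
    <title>A Minimal Article</title>
    <author>Jane Doe</author>
    <date>today</date>
    <institution>Example University</institution>
  </meta>

  <!-- The abstract is placed at the top of the output wherever it is declared. -->
  <abstract>
    <p>A one-paragraph summary of the article.</p>
  </abstract>

  <!-- Only name is given, so it doubles as the section title. -->
  <section name="Model">
    <p>We study <m>a x^[2] + bx + c = 0</m>.</p>
  </section>

  <!-- Assumed form: title supplied as an attribute next to name. -->
  <section name="Results" title="Main Results">
    <p>Stuff....</p>
  </section>
</article>
```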

If you use the <ref> tag you can create links to various sections so that in a long document, your readers can easily refer back to other sections. The other advantage of this is that there are direct links to different sections of the article so people discussing your article online can link to the relevant section (the link is the document URL + #section_name).

For example, the table of contents of this documentation is:


      
               <section name="Table of Contents">
               <ol>
               <li><ref>Introduction</ref></li>
               <li><ref>Writing an Article</ref></li>
               <li><ref>Storing Cites in a Bibliography</ref></li>
               <li><ref>Writing Math</ref></li>
               <li><ref>Custom Styling</ref></li>
               <li><ref>Math Vocabulary</ref></li>               
               </ol>
               </section>
    
    

Another thing that will come up on occasion is that you want to insert a literal angle bracket (< or >). There are three ways to do it:

  1. The <lt/> and <gt/> tags will insert < or >, respectively.
  2. The <tag> tag will wrap its content in angle brackets, like: <tag>hello</tag> = <hello>.
  3. You can insert angle brackets inline using the <m> math environment (discussed more in Writing Math) with: <m>lt</m> or <m>gt</m>.
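
As a sketch, the three approaches can sit side by side in one section. The sentences are made up for illustration, both bracket tags are written in self-closing form as XML requires, and the comment above each line gives the text it should produce going by the descriptions in the list.

```xml
<section name="Angle Brackets">
  <!-- 1. Separate tags for each bracket: should render <p> -->
  <p>Paragraphs start with <lt/>p<gt/>.</p>

  <!-- 2. The tag wrapper: should render <section> -->
  <p>Sections are wrapped in <tag>section</tag>.</p>

  <!-- 3. Inside math: should render a < b -->
  <p>We assume <m>a lt b</m>.</p>
</section>
```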

Storing Cites in a Bibliography

ArticleML can maintain bibliographic information in a convenient way and handle inline citations.

Alongside the article file type described above, ArticleML can process bibliography files. These files have <bibliography> as their root tag.

Below the root level tag, there are a number of <entry> tags with name attributes indicating how you want to refer to the citation in the text.

For example, see the bibliography file below:


      
               <bibliography>
               <entry name="Flynn2024">
               <author>Zach Flynn</author>
               <title>My Paper</title>
               <publication>A Very Famous Journal</publication>
               <volume>38</volume>
               <pages>1-23</pages>
               <year>2024</year>
               </entry>
               </bibliography>
    
    

Then, within the text, you can reference the bibliography entry using the <cite> tag like this:


      In this paper, I add to the brilliant work with earth-shattering impact by <cite>Flynn2024</cite>.
    
    

The <cite> tag inserts an inline citation that looks like this: Zach Flynn (2024) and links to the references section added at the end of the article. Speaking of which...

To use a bibliography file in your article, you need to reference it in a <bibliography> block in the article file. The <bibliography> block has a file attribute giving the path to the bibliography file, and the content within the tags tells ArticleML how to format the bibliography entries.

For example:


      
               <bibliography file="bibliography.xml">
               <source-author/> (<source-year/>). <u><source-title/></u>. <i><source-publication/> <source-volume/></i>, <source-pages/>.
               </bibliography>
    
    

Within the <bibliography> section, tags with the format <source-ATTRIBUTE> evaluate to the relevant ATTRIBUTE in the bibliography file. ArticleML lets you specify any ATTRIBUTE in the bibliography file, i.e. you can have your own schema for bibliography entries.
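
Because any field name can be used, a bibliography can carry fields beyond those in the example above. The sketch below adds a url field to an entry and prints it through <source-url/>. The field name url and the address are assumptions chosen for illustration, not part of any fixed schema. First, the bibliography file:

```xml
<bibliography>
  <!-- The url field below is a hypothetical extra field. -->
  <entry name="Flynn2024">
    <author>Zach Flynn</author>
    <title>My Paper</title>
    <publication>A Very Famous Journal</publication>
    <year>2024</year>
    <url>https://example.org/my-paper</url>
  </entry>
</bibliography>
```

Then the matching block in the article file, which refers to the new field by name:

```xml
<bibliography file="bibliography.xml">
  <source-author/> (<source-year/>). <u><source-title/></u>. <i><source-publication/></i>. <source-url/>
</bibliography>
```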

Writing Math

ArticleML uses a simple language for writing mathematics geared towards being easy to use in an XML document. Essentially, you write math in roughly the same way as you would write it in an email. It compiles to MathML, the standard math language for the web. Because MathML is valid XML, you could just write your math using MathML, but it is not very convenient to write and is difficult to read at a glance. ArticleML's <m> sublanguage looks more like normal mathematics.

MathML can still be useful if you end up needing some advanced formatting; see Mozilla's resources for more information.

To get a sense of what the <m> syntax looks like: @sum[i=1; n] i = @frac[n(n+1); 2] renders to:

\[ \sum_{i=1}^{n} i = \frac{n(n+1)}{2} \]

Math is written between <m> tags, like for the above:


      
               <m style="display">
               @sum[i=1;n] i = @frac[n(n+1); 2]
               </m>
    
    

All of the function-like expressions in <m> start with an @ symbol to differentiate them from other kinds of symbols, like the Greek characters, which do not need any kind of marker. For example, sigma renders as: σ.

To render any string as text even if it contains otherwise meaningful syntax for <m>, use quotation marks, i.e.: "sigma" renders as: sigma

You can do subscripts and superscripts like this (brackets required): x_[i]^[j]. The output is:

\[ x_{i}^{j} \]

The style attribute can currently be either "display" or "inline". "display" formats the math centered on the page in a separate paragraph (like above). "inline" displays the math within the current paragraph. If you do not specify the style attribute, "inline" is assumed, like so:


      The value of <m>theta</m> is 2.
    
    

Which gets rendered as: The value of θ is 2.

Use doubled brackets to write matrices in <m>:


        [[1;2;3];[4;5;6]]
      

Which outputs the following:

\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \]

To use parentheses instead of brackets as the outer delimiters, write:


        ([1;2;3];[4;5;6])
      

Which outputs the following:

\[ \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \]
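
Matrices combine with the rest of the <m> syntax. As a small sketch that uses only the pieces shown so far (the symbols are arbitrary), a display equation defining a two-by-two matrix could be written as:

```xml
<m style="display">
A = [[alpha;beta];[gamma;delta]]
</m>
```

Going by the examples above, this should render as a bracketed matrix with α and β in the first row and γ and δ in the second.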

You can mix math and text in the natural way: @under["max"; x] space u(x) quad "st:" quad x ge 0, which renders as:

\[ \underset{x}{\text{max}} \; u(x) \quad \text{st:} \quad x \ge 0 \]

Generally, you should quote long strings in <m>, both because they will format better and because substrings would otherwise get picked up as symbols. For example, quadratic without quotes renders as:   ratic because "quad" is a spacing operator. Writing "quadratic" gives what you probably want: quadratic.
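
For instance, a text label inside an inline formula can be quoted so that the quad inside quadratic is not read as a spacing operator, while the unquoted quad after it still inserts a space. The label is made up for illustration.

```xml
<p>The penalty is <m>"quadratic loss:" quad x^[2]</m>.</p>
```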

Lastly, if you want to pass something into the MathML output exactly as written without formatting, use backticks (`) to delimit the text, e.g.: @color[`red`; X]. If the "red" weren't backticked here, it would get formatted as an identifier, which isn't what the color function expects. See the output below:

X

To insert a literal [ or ] value, use lbracket or rbracket.

And that's really it as far as the grammar and syntax of the math language go. For the full vocabulary of the language, see the tables under Math Vocabulary.
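
To see several of these pieces working together, here is a sketch of the quadratic formula for the equation used in the abstract example. It nests @sqrt inside @frac, which is assumed to work the way nested functions usually do, and it uses pm from the vocabulary tables.

```xml
<m style="display">
x = @frac[-b pm @sqrt[b^[2] - 4 a c]; 2 a]
</m>
```

The intended result is:

\[ x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a} \]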

Theorems, Assumptions, Definitions

You can declare theorems, assumptions, or definitions by doing the following:


      <article>
      <theorem name="My Theorem">
      <statement>
      This is my statement of the theorem.
      </statement>
      <discussion>
      Proof. This is the proof of the theorem.
      </discussion>
      </theorem>
      </article>
    

Assumptions and Definitions are specified in the same way except they use the <assumption> and <definition> tags. Both the <statement> and <discussion> sections are optional.
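
For instance, an assumption and a definition declared this way might look like the sketch below. The names and wording are placeholders. The structure simply mirrors the theorem example, with the discussion left out of the definition since both parts are optional.

```xml
<article>
  <assumption name="Positive Inputs">
    <statement>
      All inputs satisfy <m>x ge 0</m>.
    </statement>
    <discussion>
      This rules out corner cases that the model does not cover.
    </discussion>
  </assumption>

  <!-- No discussion here: both parts are optional. -->
  <definition name="Feasible Set">
    <statement>
      The feasible set is the set of all <m>x</m> with <m>x ge 0</m>.
    </statement>
  </definition>
</article>
```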

To insert a theorem, assumption, or definition within a <section>, use:


      <theorem statement="yes">Theorem Name</theorem> (inserts the theorem label and statement)
      <theorem discussion="yes">Theorem Name</theorem> (inserts the theorem label and the discussion)
      <theorem discussion="yes" statement="yes">Theorem Name</theorem> (inserts the theorem label and both the statement and the discussion)
      <theorem refdisc="yes">Theorem Name</theorem> (inserts a link to a discussion-only insert of the theorem; use it to link to a proof placed in a later section, for example).
      <theorem>Theorem Name</theorem> (inserts a link to a statement of the theorem).
    

You can use any code that you would use in writing sections within the <statement> and <discussion> tags.
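
As a sketch of how these inserts fit together, the two sections below state a theorem in the main text and defer its proof to an appendix. It assumes a theorem named My Theorem has been declared as in the earlier example. The section names are placeholders, and placing the link forms inside a <p> is an assumption.

```xml
<section name="Main Result">
  <!-- Inserts the label and the statement only. -->
  <theorem statement="yes">My Theorem</theorem>
  <!-- Links to the discussion-only insert in the appendix. -->
  <p>The proof is in <theorem refdisc="yes">My Theorem</theorem>.</p>
</section>

<section name="Appendix">
  <!-- Inserts the label and the discussion (the proof). -->
  <theorem discussion="yes">My Theorem</theorem>
  <!-- Links back to the statement. -->
  <p>See <theorem>My Theorem</theorem> for the statement.</p>
</section>
```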

Custom Styling

To use custom CSS for the article, include a <style> tag below <article>. For example, the entry for the documentation is:


    <style>
    table {
    border: 0.1em solid black;
    width:100%;
    text-align: center;
    }
    th,td {
    border-bottom: 0.1em solid black;
    }
    body {
    background-color: #eee;
    }
    #main-content {
    background-color: white;
    padding:2em;
    }
    .row {
    display: flex;
    }

    .column {
    flex: 50%;
    }    
    </style>
    

There are a few classes and ID's you can hook into. The content of the page is in a div with id = "main-content". All sections have their "id" set to their "name" attribute and their class is "normal-section". The abstract has class and id both set to "abstract". All math is between <math> tags. Title, author, institution, and date have id's set to "title", "author", "institution", and "date". Bibliography entries have ID's equal to their name and the bibliography section has class "bibliography".
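
For example, a stylesheet that uses these hooks might look like the sketch below. The selectors come from the list above; the property values are arbitrary choices for illustration.

```xml
<style>
  /* Narrow the main column. */
  #main-content {
    max-width: 45em;
    margin: 0 auto;
  }
  /* Every section carries this class. */
  .normal-section {
    margin-bottom: 2em;
  }
  /* The abstract has both id and class "abstract". */
  #abstract {
    font-style: italic;
  }
  /* Heading fields are addressed by id. */
  #title {
    text-align: center;
  }
  /* The references section. */
  .bibliography {
    font-size: 90%;
  }
</style>
```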

And that's all there is to styling. In the future, ArticleML may add some convenience options for quickly accessing common style changes.

Math Vocabulary

Symbols
<m> code Output
uparrow
downarrow
leftarrow
rightarrow
harrow
thickrightarrow
thickleftarrow
thickharrow
mapstoarrow
toarrow
tailtwoheadrightarrow
tailrightarrow
twoheadrightarrow
leftrightarrow
thickleftrightarrow
neg ¬
implies
iff
forall
exists
bottom
top
vdash
models
int
oint
partialder
nabla
pm ±
emptyset
infty
aleph
ldots
therefore
triangle
angle
prime
space " "
frown
quad "  "
qquad "    "
cdots
vdots
ddots
diamond
square
lfloor
rfloor
lceiling
rceiling
qed
CC
NN
QQ
RR
ZZ
( (
) )
{ {
} }
lbracket [
rbracket ]
sc ;
Operators
<m> code Output
+ +
/ /
! !
-
*
cdot
bowtie
ltimes
times ×
circ
oplus
otimes
odot
wedge
vee
cap
cup
sum Σ
prod Π
bigwedge
bigvee
bigcap
bigcup
Functions
Unary
<m> code Output
@sqrt[X] $\sqrt{X}$
@sum[X] $\sum_{X}$
@op[X] X
@text[X] X
@hat[X] $\hat{X}$
@overline[X] $\overline{X}$
@underline[X] $\underline{X}$
@vec[X] $\vec{X}$
@dot[X] $\dot{X}$
@ddot[X] $\ddot{X}$
@bb[X] X
@bbb[X] X
@cc[X] X
@tt[X] X
@fr[X] X
@sf[X] X
@abs[X] |X|
Greek
<m> code Output
alpha α
beta β
chi χ
delta δ
Delta Δ
epsilon ε
epsi ε
varepsilon ε
eta η
gamma γ
Gamma Γ
iota ι
kappa κ
lambda λ
Lambda Λ
mu μ
nu ν
omega ω
Omega Ω
phi φ
varphi φ
Phi Φ
pi π
Pi Π
psi ψ
Psi Ψ
rho ρ
sigma σ
Sigma Σ
tau τ
theta θ
vartheta ϑ
Theta Θ
upsilon υ
xi ξ
Xi Ξ
zeta ζ
Relations
<m> code Output
= =
!=
:=
preceq
prec
succeq
succ
le
lt <
ge
gt >
rtimes
in
notin
subseteq
subset
supseteq
supset
equiv
cong
approx
propto
nsube
nsub
nsupe
nsup
ni
nni
Functions
Binary
<m> code Output
@frac[X;Y] $\frac{X}{Y}$
@sum[X;Y] $\sum_{X}^{Y}$
@root[X;Y] $\sqrt[Y]{X}$
@color[`red`;Y] Y (shown in red)
@super[X;Y] $X^{Y}$
@sub[X;Y] $X_{Y}$
@over[X;Y] $\overset{Y}{X}$
@under[X;Y] $\underset{Y}{X}$
Functions
Ternary
<m> code Output
@underover[X;Y;Z] $\underset{Y}{\overset{Z}{X}}$