feat(latex): combination and rendering of multi-line latex formulas. #520

@LexeyKhom

Problem

Previously, each formula was rendered as a separate virtual text block, which
led to visual displacement and clutter, especially when multiple formulas were
on one line.

This change combines the top and bottom virtual lines of all inline formulas on
a single physical line into two unified virtual text blocks (one above and one
below the main text line). The baseline of each formula is rendered inline,
preserving the text flow.

Approach to Solution

The chosen approach separates the logic of combining from the logic of
rendering.

  • Function Handler:combine first processes all formulas on the line: converts
    them, calculates their display properties, and returns structured data,
    including a baseline list ({node, text}) and combined top and bottom
    virtual lines (string[]).
  • Handler:run calls combine to get the data, then iterates through this data
    to render the baselines, and finally renders the combined top and bottom
    virtual lines.
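
The split between combining and rendering described above can be sketched in plain Lua, outside Neovim. This is a minimal illustration, not the handler code: the names pad, reverse, combine_lines and combine are stand-ins, each formula is assumed to already carry its display column and converted lines, and widths are byte lengths, so it is only correct for ASCII input.

```lua
-- Stand-alone sketch with hypothetical names; widths are byte lengths,
-- so this is only correct for ASCII input.
local function pad(n)
    return n > 0 and string.rep(' ', n) or ''
end

local function reverse(list)
    local n = #list
    for i = 1, math.floor(n / 2) do
        list[i], list[n - i + 1] = list[n - i + 1], list[i]
    end
    return list
end

-- Append `lines` to `dest`, each starting at column `col` and padded to `width`.
local function combine_lines(dest, lines, col, width)
    for i, line in ipairs(lines) do
        local current = dest[i] or ''
        dest[i] = current .. pad(col - #current) .. line .. pad(width - #line)
    end
end

-- formulas: list of { col = display column, lines = converted output }
local function combine(formulas)
    local baselines, top, bot = {}, {}, {}
    for _, formula in ipairs(formulas) do
        local lines = formula.lines
        local mid = math.floor(#lines / 2) + 1 -- 'center' position
        local width = 0
        for _, line in ipairs(lines) do
            width = math.max(width, #line)
        end
        baselines[#baselines + 1] = lines[mid] .. pad(width - #lines[mid])
        -- Collect rows above the baseline nearest-first, so formulas of
        -- different heights stay aligned against the text line.
        local above, below = {}, {}
        for i = mid - 1, 1, -1 do
            above[#above + 1] = lines[i]
        end
        for i = mid + 1, #lines do
            below[#below + 1] = lines[i]
        end
        combine_lines(top, above, formula.col, width)
        combine_lines(bot, below, formula.col, width)
    end
    -- `top` was built nearest-first; flip it into top-to-bottom order.
    return baselines, reverse(top), bot
end

local baselines, top, bot = combine({
    { col = 3, lines = { 'a', '-', 'b' } },
    { col = 7, lines = { 'c', '-', 'd' } },
})
```

Here `top[1]` is `'   a   c'` and `bot[1]` is `'   b   d'`. A renderer would then emit each baseline inline and the two combined lists as virtual lines above and below the row.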

Code

Helper code for list.reverse.

lib/iter.lua:

---@generic T
---@param values T[]
---@return T[]
function M.list.reverse(values)
    local n = #values
    for i = 1, math.floor(n / 2) do
        values[i], values[n - i + 1] = values[n - i + 1], values[i]
    end
    return values
end

Main code.

handler/latex.lua:

local Context = require('render-markdown.request.context')
local Indent = require('render-markdown.lib.indent')
local Marks = require('render-markdown.lib.marks')
local Node = require('render-markdown.lib.node')
local iter = require('render-markdown.lib.iter')
local log = require('render-markdown.core.log')
local str = require('render-markdown.lib.str')

---@private
---@class LatexBuffer
---@field private context render.md.request.Context
local LatexBuffer = {}
LatexBuffer.__index = LatexBuffer

---@param context render.md.request.Context
function LatexBuffer.new(context)
    local self = setmetatable({}, LatexBuffer)
    --!NOTE: The buffer must live on `context` rather than in a global or
    -- module-local variable; otherwise, if the file changes during rendering,
    -- formulas from the previous file could leak into the new one (I think).
    self.context = context
    self.context.latex_buffer = self.context.latex_buffer or {}
    return self
end

---@param node render.md.Node
function LatexBuffer:add(node)
    table.insert(self.context.latex_buffer, node)
end

function LatexBuffer:is_empty()
    return #self.context.latex_buffer == 0
end

---@return integer
function LatexBuffer:row()
    return self:is_empty() and -1 or self.context.latex_buffer[1].start_row
end

---@return render.md.Node[]
function LatexBuffer:flash()
    local temp_nodes = self.context.latex_buffer
    self.context.latex_buffer = {}
    return temp_nodes
end

---@class render.md.handler.buf.Latex
---@field private context render.md.request.Context
---@field private config render.md.latex.Config
---@field private latex_buffer LatexBuffer
local Handler = {}
Handler.__index = Handler

---@private
---@type table<string, string>
Handler.cache = {}

---@param buf integer
---@return render.md.handler.buf.Latex
function Handler.new(buf)
    local self = setmetatable({}, Handler)
    self.context = Context.get(buf)
    self.config = self.context.config.latex
    self.latex_buffer = LatexBuffer.new(self.context)
    return self
end

---@param root TSNode
---@return render.md.Mark[]
function Handler:run(root)
    if not self.config.enabled then
        return {}
    end
    if vim.fn.executable(self.config.converter) ~= 1 then
        log.add('debug', 'ConverterNotFound', self.config.converter)
        return {}
    end

    local node = Node.new(self.context.buf, root)
    log.node('latex', node)

    --!WARN: In the current implementation, the LAST LINE is not rendered!
    -- An `after` hook or a ts_node counter is needed so that the handler can
    -- tell when it has received the last node.
    local same_row = self.latex_buffer:is_empty()
        or node.start_row == self.latex_buffer:row()
    if same_row then
        self.latex_buffer:add(node)
        return {}
    end

    local marks = Marks.new(self.context, true)
    local nodes = self.latex_buffer:flash()

    -- NOTE: variant 1
    local baselines, top_lines, bot_lines = self:combine(nodes)
    for _, baseline in ipairs(baselines) do
        self:render_baseline(marks, baseline.node, baseline.text)
    end
    self:render_virtual_lines(marks, nodes[1], top_lines, true)
    self:render_virtual_lines(marks, nodes[1], bot_lines, false)

    -- NOTE: variant 2
    -- self:combine_and_render_nodes(marks, nodes)

    self.latex_buffer:add(node)
    return marks:get()
end

---@param marks render.md.Marks
---@param nodes render.md.Node[]
function Handler:combine_and_render_nodes(marks, nodes)
    table.sort(nodes, function(a, b)
        return a.start_col < b.start_col
    end)

    local combined_top_lines = {}
    local combined_bot_lines = {}
    local formulas_width = 0
    for _, node in ipairs(nodes) do
        local lines = str.split(self:convert(node.text), '\n', true)

        local baseline_index = self:get_baseline_index(lines)
        local max_width = vim.fn.max(iter.list.map(lines, str.width))
        local display_start_pos = self:get_display_start_pos(node)
            + formulas_width

        local baseline =
            vim.list_slice(lines, baseline_index, baseline_index)[1]
        local suffix = str.pad(max_width - str.width(baseline))
        self:render_baseline(marks, node, baseline .. suffix)

        local top_lines = vim.list_slice(lines, 1, baseline_index - 1)
        local bot_lines = vim.list_slice(lines, baseline_index + 1, #lines)
        iter.list.reverse(top_lines)
        self:combine_lines(
            combined_top_lines,
            top_lines,
            display_start_pos,
            max_width
        )
        self:combine_lines(
            combined_bot_lines,
            bot_lines,
            display_start_pos,
            max_width
        )

        formulas_width = formulas_width + max_width
    end

    iter.list.reverse(combined_top_lines)
    self:render_virtual_lines(marks, nodes[1], combined_top_lines, true)
    self:render_virtual_lines(marks, nodes[1], combined_bot_lines, false)
end

---@param nodes render.md.Node[]
---@return {node: render.md.Node, text: string}[] baselines, string[] top_lines, string[] bot_lines
function Handler:combine(nodes)
    local baselines_tuples = {} ---@type {node: render.md.Node, text: string}[]
    local combined_top_lines = {} ---@type string[]
    local combined_bot_lines = {} ---@type string[]

    table.sort(nodes, function(a, b)
        return a.start_col < b.start_col
    end)

    local concealed_formulas_width = 0
    local visible_formulas_width = 0
    for _, node in ipairs(nodes) do
        local lines = str.split(self:convert(node.text), '\n', true)

        local baseline_index = self:get_baseline_index(lines)
        local formula_width = vim.fn.max(iter.list.map(lines, str.width))
        local display_start_pos = self:get_display_start_pos(node)
            - concealed_formulas_width
            + visible_formulas_width

        local baseline =
            vim.list_slice(lines, baseline_index, baseline_index)[1]
        local suffix = str.pad(formula_width - str.width(baseline))
        table.insert(
            baselines_tuples,
            { node = node, text = baseline .. suffix }
        )

        local top_lines = vim.list_slice(lines, 1, baseline_index - 1)
        local bot_lines = vim.list_slice(lines, baseline_index + 1, #lines)

        iter.list.reverse(top_lines)
        self:combine_lines(
            combined_top_lines,
            top_lines,
            display_start_pos,
            formula_width
        )
        self:combine_lines(
            combined_bot_lines,
            bot_lines,
            display_start_pos,
            formula_width
        )

        concealed_formulas_width = concealed_formulas_width
            + str.width(node.text)
        visible_formulas_width = visible_formulas_width + formula_width
    end
    return baselines_tuples,
        iter.list.reverse(combined_top_lines),
        combined_bot_lines
end

---@private
---@param lines string[]
---@return integer
function Handler:get_baseline_index(lines)
    if self.config.position == 'above' then
        return #lines
    elseif self.config.position == 'below' then
        return 1
    elseif self.config.position == 'center' then
        return math.floor(#lines / 2) + 1
    end
    return math.floor(#lines / 2) + 1
end

---@param node render.md.Node
---@return integer
function Handler:get_display_start_pos(node)
    local _, first = node:line('first', 0)
    local temp_concealed_node = {
        start_row = node.start_row,
        start_col = 0,
        end_row = node.start_row,
        end_col = node.start_col,
    }
    local raw_start_pos = first and str.width(first:sub(1, node.start_col))
        or node.start_col
    local concealed = self.context.conceal:get(temp_concealed_node)
    local display_start_pos = raw_start_pos - concealed
    return display_start_pos > 0 and display_start_pos or 0
end

---@param dist_lines string[]
---@param lines string[]
---@param display_start_pos integer
---@param max_line_width integer
function Handler:combine_lines(
    dist_lines,
    lines,
    display_start_pos,
    max_line_width
)
    for i = 1, #lines do
        -- The target line may not exist yet; start from an empty string
        local current = dist_lines[i] or ''
        local prefix = str.pad(display_start_pos - str.width(current))
        local suffix = str.pad(max_line_width - str.width(lines[i]))
        dist_lines[i] = current .. prefix .. lines[i] .. suffix
    end
end

---@param marks render.md.Marks
---@param node render.md.Node
---@param lines string[]
---@param above boolean
function Handler:render_virtual_lines(marks, node, lines, above)
    if #lines == 0 then
        return
    end

    local text = {} ---@type string[]
    if above then
        for _ = 1, self.config.top_pad do
            text[#text + 1] = ''
        end
    end
    for _, line in ipairs(lines) do
        text[#text + 1] = line
    end
    if not above then
        for _ = 1, self.config.bottom_pad do
            text[#text + 1] = ''
        end
    end

    local indent = self:indent(node.start_row, node.start_col)
    local virt_lines = iter.list.map(text, function(part)
        local line = vim.list_extend({}, indent) ---@type render.md.mark.Line
        line[#line + 1] = { part, self.config.highlight }
        return line
    end)

    marks:add(self.config, 'virtual_lines', node.start_row, 0, {
        virt_lines = virt_lines,
        virt_lines_above = above,
    })
end

---@param marks render.md.Marks
---@param node render.md.Node
---@param baseline string
function Handler:render_baseline(marks, node, baseline)
    marks:over(self.config, true, node, {
        virt_text = { { baseline, self.config.highlight } },
        virt_text_pos = 'inline',
        conceal = '',
    })
end

---@private
---@param text string
---@return string
function Handler:convert(text)
    local result = Handler.cache[text]
    if not result then
        local converter = self.config.converter
        result = vim.fn.system(converter, text)
        -- Treat any non-zero exit code as a failure, not only exit code 1
        if vim.v.shell_error ~= 0 then
            log.add('error', 'ConverterFailed', converter, result)
            result = 'error'
        end
        Handler.cache[text] = result
    end
    return result
end

---@private
---@param row integer
---@param col integer
---@return render.md.mark.Line
function Handler:indent(row, col)
    local buf = self.context.buf
    local node = vim.treesitter.get_node({
        bufnr = buf,
        pos = { row, col },
        lang = 'markdown',
    })
    if not node then
        return {}
    end
    return Indent.new(self.context, Node.new(buf, node)):line(true):get()
end

---@class render.md.handler.Latex: render.md.Handler
local M = {}

---@param ctx render.md.handler.Context
---@return render.md.Mark[]
function M.parse(ctx)
    return Handler.new(ctx.buf):run(ctx.root)
end

return M

Here are two implementation options. Both work, but each has its nuances:

  • Variant 1. I tried to separate the logic of combining and rendering, but
    this introduces a logical flaw: Handler:get_display_start_pos does NOT
    account for concealed formulas, because they are not yet concealed when
    conceal:get is called. Therefore, I have to account for them myself in
    the combine function.

  • Variant 2. An alternative approach is to combine rendering with
    combining. In this case, conceal:get will account for concealed formulas,
    but in the current implementation of conceal.lua, conceal:get returns not
    display width, but byte width, which means it will be necessary to fix it
    or add an alternative (e.g. conceal:get_display). Here is a TEMPORARY
    fix for testing:

    request/conceal.lua:

    --!WARN: This is just an example! Using `vim.api.nvim_buf_get_lines` here is not the best approach.
    --
    ---@param node render.md.Node
    ---@return integer
    function Conceal:get(node)
        local result = 0
        local col = { node.start_col, node.end_col } ---@type render.md.Range
        local line = vim.api.nvim_buf_get_lines(
            self.buf,
            node.start_row,
            node.end_row + 1,
            true
        )[1]
        for _, section in ipairs(self:line(node).sections) do
            if interval.overlaps(section.col, col, true) then
                local section_line = line:sub(section.col[1] + 1, section.col[2])
                local section_width = str.width(section_line)
                local width = section_width - self:width(section.character)
                result = result + width
            end
        end
        return result
    end
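
The difference between byte width and display width that Variant 2 runs into shows up with any non-ASCII text. The sketch below is plain Lua and only counts code points (with a hypothetical helper named codepoints), which approximates display width for narrow characters; inside Neovim, vim.fn.strdisplaywidth gives the actual value.

```lua
-- Count UTF-8 code points by counting every byte that is not a
-- continuation byte (0x80-0xBF). Works on Lua 5.1 and LuaJIT.
local function codepoints(s)
    local _, count = s:gsub('[^\128-\191]', '')
    return count
end

-- 'a₁' written with decimal byte escapes: U+2081 is encoded as E2 82 81.
local text = 'a\226\130\129'
local byte_width = #text            -- 4 bytes
local char_count = codepoints(text) -- 2 code points

-- Subtracting a byte count where a display width is expected moves
-- everything after the non-ASCII text too far left, here by 2 columns.
local drift = byte_width - char_count
```

This is why the test file includes a section with non-ASCII symbols before the formulas: any width computed from byte offsets drifts as soon as such characters are concealed.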

Both options work. The first option is better logically separated, the second
is simpler to understand (from my point of view) and, most likely, slightly
more performant.
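
The manual correction that Variant 1 applies in combine can be followed with plain numbers. The sketch below assumes a line such as `if $\frac{a}{b}$ = $\frac{c}{d}$`, where each formula's source is 13 columns wide and is assumed to render 1 column wide; the field names and the helper display_columns are hypothetical.

```lua
-- Each entry is one inline formula on the same physical line.
--   raw_col:        display column of the formula in the unrendered line
--   source_width:   width of the raw LaTeX text (hidden once rendered)
--   rendered_width: width of the converted formula
local formulas = {
    { raw_col = 3, source_width = 13, rendered_width = 1 },
    { raw_col = 19, source_width = 13, rendered_width = 1 },
}

-- For every formula, shift its raw column left by the source text of the
-- formulas before it and right by their rendered width.
local function display_columns(list)
    local concealed, visible = 0, 0
    local columns = {}
    for i, formula in ipairs(list) do
        columns[i] = formula.raw_col - concealed + visible
        concealed = concealed + formula.source_width
        visible = visible + formula.rendered_width
    end
    return columns
end

local columns = display_columns(formulas)
```

With these numbers the first formula stays at column 3 and the second moves from column 19 to column 7, which matches the rendered text `if - = -`.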

Testing

Test file I used:

# Latex test

1. **Hidden** \__symbols_`:` $A = \begin{pmatrix}a& b\\& \\c& d\end{pmatrix}$.
   - _italic_: $A = \begin{pmatrix}a\\c\end{pmatrix}$
   - **bold**: $A = \begin{pmatrix}a\\c\end{pmatrix}$
   - `code` $A = \begin{pmatrix}a\\c\end{pmatrix}$
   - \*\*not bold\*\* $A = \begin{pmatrix}a\\c\end{pmatrix}$
   - **Hidden symbols in another line** _lorem ipsum dolor amet lorem **ipsum
     dolor** amet_: $A = \begin{pmatrix}a& b\\& \\c& d\end{pmatrix}$.
2. Two or more multi-line formulas in one virtual line (combine):
   - if $\frac{a}{b}$ = $\frac{c}{d}$ = $\frac{e}{f}$
   - $\begin{pmatrix}a\\c\end{pmatrix}$ and
     $\begin{pmatrix}b\\d\\e\end{pmatrix}$
3. Non-ASCII symbols:
   - "ыыы₁₁₁" $A = \begin{pmatrix}a& b\\& \\c& d\end{pmatrix}$.
   - $\begin{pmatrix}a₁\\a₂\end{pmatrix}$ and $\begin{pmatrix}b₁\\b₂\\b₃\end{pmatrix}$
4. Different height:
   - $\begin{pmatrix}a₁\\a₂\end{pmatrix}$ and $\frac{ \sum_{i=0}^{N} (X_i - \bar{X})^2 }{ \prod_{j=1}^{M} \frac{1}{j!} }$ and $\begin{pmatrix} \text{Very} \\ \text{many} \\ \text{lines} \\ \text{in} \\ \text{matrix} \end{pmatrix}$.
5. Multi-line formula:
   - with `$`: $A = \begin{pmatrix} a₁ & b₁ \\ & \\a₂&b₂\end{pmatrix}
      = \begin{pmatrix}c₁\\ \\ c₂\end{pmatrix}$
   - with `$$`: $$
     A = \begin{pmatrix} a₁ & b₁ \\ & \\a₂&b₂\end{pmatrix}
       = \begin{pmatrix}c₁\\ \\ c₂\end{pmatrix}
     $$ $last node$

Configuration for tests

lazy.nvim:

return {
  "MeanderingProgrammer/render-markdown.nvim",
  ft = { "markdown", "Avante" },
  dependencies = {
    "nvim-treesitter/nvim-treesitter",
    "nvim-tree/nvim-web-devicons",
  },
  opts = {
    latex = {
      position = "center",
      converter = "utftex",
      top_pad = 0,
      bottom_pad = 0,
    },
  },
}

What NEEDS to be added

  1. WARNING: In the current implementation, the last latex line is not
    rendered! This is easy to fix by adding something like a call to the
    after() function or adding a counter for current nodes to the context. But
    I'll leave this implementation to you, as I'm not sure how best to proceed.
  2. Naturally, add ---@field latex_buffer render.md.Node[] to context. I
    haven't found better solutions to store this outside of the context. Global
    and local variables can have bugs.
  3. I have not implemented the virtual = false logic. Sorry, I'm just tired.
    It's not that hard to implement, just a bit tedious.
  4. It's necessary to update config/latex.lua. I added position = 'center'.
  5. Naturally, write tests.
  6. I also strongly recommend using converter = 'utftex' by default or adding
    it to the documentation as an alternative. I haven't seen anything better
    than utftex yet, but there might be something. Links: github and brew.
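
The last-line problem from item 1 comes from the buffering pattern itself: a row is only emitted when a node from the next row arrives. A minimal plain-Lua sketch of that pattern with an explicit final step is shown below. RowBuffer, push and finish are hypothetical names, and where such a final call would be hooked into the plugin is left open, as in the item above.

```lua
-- Hypothetical sketch: group nodes by row as they stream in one at a time.
local RowBuffer = {}
RowBuffer.__index = RowBuffer

function RowBuffer.new()
    return setmetatable({ nodes = {} }, RowBuffer)
end

-- Returns the completed previous row when `node` starts a new row,
-- otherwise buffers the node and returns nil.
function RowBuffer:push(node)
    local first = self.nodes[1]
    if first == nil or first.row == node.row then
        self.nodes[#self.nodes + 1] = node
        return nil
    end
    local done = self.nodes
    self.nodes = { node }
    return done
end

-- Must be called once after the final node; without it the last row
-- stays in the buffer and is never rendered.
function RowBuffer:finish()
    local done = self.nodes
    self.nodes = {}
    return #done > 0 and done or nil
end

local buffer = RowBuffer.new()
local rows = {}
for _, node in ipairs({ { row = 1 }, { row = 1 }, { row = 2 } }) do
    local done = buffer:push(node)
    if done then
        rows[#rows + 1] = done
    end
end
local rows_before_finish = #rows -- only row 1 has been emitted so far

local last = buffer:finish()
if last then
    rows[#rows + 1] = last
end
```

Before finish is called only one row has been emitted; afterwards both rows are present.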

What CAN be added

  1. colorize can be added. As I saw in nabla.nvim, it's relatively simple to
    implement - just iterating through lines character by character and coloring
    based on whether it's a digit, symbol, etc. It doesn't look very performant,
    so the solution is generally debatable.
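
A character-by-character colorize pass of the kind described above could look roughly like the following plain-Lua sketch. It is not taken from nabla.nvim: the classification rules, the choice of the standard highlight groups Number, Identifier, Operator and Normal, and the names classify and colorize are all assumptions made for illustration.

```lua
-- Pick a highlight group for a single character (hypothetical rules).
local function classify(ch)
    if ch:match('^%d$') then
        return 'Number'
    elseif ch:match('^%a$') then
        return 'Identifier'
    elseif ch:match('^%s$') then
        return 'Normal'
    end
    -- Everything else, including multi-byte characters, is an operator.
    return 'Operator'
end

-- Split a line into { text, highlight } chunks, merging neighbouring
-- characters that share a highlight group.
local function colorize(line)
    local chunks = {}
    -- One UTF-8 character: a non-continuation byte plus its continuation bytes.
    for ch in line:gmatch('[^\128-\191][\128-\191]*') do
        local hl = classify(ch)
        local last = chunks[#chunks]
        if last and last[2] == hl then
            last[1] = last[1] .. ch
        else
            chunks[#chunks + 1] = { ch, hl }
        end
    end
    return chunks
end

local chunks = colorize('a12 = b')
```

For `'a12 = b'` this yields six chunks, the second being `{ '12', 'Number' }`. Merging neighbouring characters keeps the chunk count low, which may soften the performance concern somewhat, but the pass still touches every character of every rendered line.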

Comparison with nabla.nvim

render-markdown:

Image

nabla.nvim:

Image

Config for tests

local nabla = 0
-- !NOTE: 0 - render-markdown
--        1 - nabla
if nabla >= 1 then
  return {
    "MeanderingProgrammer/render-markdown.nvim",
    dependencies = {
      "nvim-treesitter/nvim-treesitter",
      "nvim-tree/nvim-web-devicons",
      {
        "jbyuki/nabla.nvim",
        dependencies = {
          "williamboman/mason.nvim",
          "nvim-treesitter/nvim-treesitter",
        },
      },
    },
    opts = {
      latex = { enabled = false },
      win_options = { conceallevel = { rendered = 2 } },
      on = {
        render = function()
          require("nabla").enable_virt { autogen = true }
        end,
        clear = function()
          require("nabla").disable_virt()
        end,
      },
    },
  }
end
return {
  "MeanderingProgrammer/render-markdown.nvim",
  dependencies = {
    "nvim-treesitter/nvim-treesitter",
    "nvim-tree/nvim-web-devicons",
  },
  opts = {
    latex = {
      position = "center",
      converter = "utftex",
      top_pad = 0,
      bottom_pad = 0,
    },
  },
}

Results

Pros:

  • Much more performant.
  • Correct indentation with non-ASCII characters.
  • Supports the entire LaTeX syntax (nabla.nvim does not).

Cons:

  • No colors
