Supporting Custom Syntax in Astro with Remark Plugins

Jun 14, 2026 13 min read

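Preface

After moving this blog from Hexo to Astro, the few hundred Markdown posts could mostly be reused as they were, but some custom syntax I had added over the years stopped working.

One case is Live Photo. In an earlier post, "Hexo 中实现 Live Photos 支持" (Implementing Live Photos Support in Hexo), I used a filter to turn the following syntax into a Live Photo: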

![示例图](1.jpg)(1.mov)

The other is the tag syntax of the hexo-tag-aplayer plugin, which I use a lot in my year-in-review posts:

{% aplayer title artist audio_url cover_url %}
{% meting "1930226368" "netease" "song" %}

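In Hexo these features are implemented with hexo.extend.filter or hexo.extend.tag, which simply replace the syntax with HTML during rendering.

After the move to Astro, however, {% aplayer %} is either printed verbatim on the page or torn apart by the Markdown parser, and the ![alt](img)(video) Live Photo syntax no longer matches either.

I also tried MDX, rewriting the tags as components such as <APlayer />, but that means changing the syntax in every old post, and pulling MDX into the whole site for a handful of custom tags is hardly worth it.

Then I remembered that Astro parses Markdown with Remark under the hood, so I can write Remark plugins that convert the old syntax to HTML at build time and leave the posts untouched.

Registering the Remark Plugins

The idea is simple: write the plugins in the project, then register them in astro.config.mjs.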

import { defineConfig } from 'astro/config';
import { unified } from '@astrojs/markdown-remark';
import { remarkLivePhoto } from './remark/live-photo.mjs';
import { remarkAplayer } from './remark/aplayer.mjs';

export default defineConfig({
  markdown: {
    processor: unified({
      // A real project can register other plugins here as needed, e.g. remarkReadingTime
      remarkPlugins: [remarkLivePhoto, remarkAplayer],
    }),
  },
});

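The plugins run once for every post at compile time and replace any custom syntax they match with HTML, which is then baked straight into the static pages.

Why a Plain Regex Replacement Does Not Work

A Hexo filter receives the complete Markdown string, so a single regex does the job: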

data.content = data.content.replace(
  /!\[(.*?)\]\((.*?)\)\((.*?)\)/g,
  (match, alt, img, video) => { /* build the HTML */ }
);

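A Remark plugin, however, receives not the raw text but the parsed mdast (Markdown AST).

For example, ![示例图](1.jpg)(1.mov) is not kept as one string. It is split into an image node and a text node, with the trailing (1.mov) ending up in the text node on its own: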

{
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        {
          type: 'image',
          title: null,
          url: '1.jpg',
          alt: '示例图',
          position: {
            start: { line: 1, column: 1, offset: 0 },
            end: { line: 1, column: 14, offset: 13 }
          }
        },
        {
          type: 'text',
          value: '(1.mov)',
          position: {
            start: { line: 1, column: 14, offset: 13 },
            end: { line: 1, column: 21, offset: 20 }
          }
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 1, column: 21, offset: 20 }
      }
    }
  ],
  position: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 1, column: 21, offset: 20 }
  }
}

If the video path is a full URL, autolink (the GFM autolink literal extension) splits it further:

// Parse result of ![示例图](1.jpg)(https://example.com/1.mov), abbreviated (positions and the link's children omitted)
{
  type: 'paragraph',
  children: [
    { type: 'image', url: '1.jpg', alt: '示例图' },
    { type: 'text', value: '(' },
    { type: 'link', url: 'https://example.com/1.mov' },
    { type: 'text', value: ')' },
  ],
}

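URLs inside {% aplayer ... %} are likewise turned into separate link nodes by autolink.

So for Live Photo, the parenthesized video part takes one of two shapes: with a full URL it may become three nodes, text: '(' + link + text: ')', whereas a relative path such as 1.mov usually stays as a single text: '(1.mov)'.

At first I ran the regex directly on text nodes, and a large share of the migrated posts failed to match. Only after dumping the mdast did I see that this splitting was the cause.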
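
For reference, dumping the tree can be done with a tiny throwaway plugin. The sketch below is not taken from my actual setup: remarkDumpTree and stripPositions are made-up names, and it only prints the tree with the noisy position fields removed.

```javascript
// Hypothetical helper: copy a node without its `position` field so the dump stays readable.
function stripPositions(node) {
  const { position, children, ...rest } = node;
  return children ? { ...rest, children: children.map(stripPositions) } : rest;
}

// Hypothetical debugging plugin: a remark plugin is a function that returns a
// transformer, and the transformer receives the mdast root of each file.
function remarkDumpTree() {
  return (tree) => {
    console.log(JSON.stringify(stripPositions(tree), null, 2));
  };
}

// Try it on a hand-built tree shaped like the Live Photo example above.
const sampleTree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'image', title: null, url: '1.jpg', alt: '示例图', position: {} },
        { type: 'text', value: '(1.mov)', position: {} },
      ],
      position: {},
    },
  ],
  position: {},
};
remarkDumpTree()(sampleTree);
```

Exported from a module and registered ahead of the other plugins, something like this would print one tree per post, so it is best enabled only while debugging.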

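Supporting Live Photo

The Live Photo plugin handles only paragraph nodes. The core idea is to join a paragraph's children back into a string, run the regex replacement, and then split the result back into mdast nodes.

paragraphToText restores the split nodes: a text node contributes its value, an image is rebuilt as ![alt](url), and if autolink turned the video path into a link it is rebuilt as (url), skipping the now-redundant parenthesis text nodes on either side of the link.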

function paragraphToText(node) {
  if (!node?.children) return '';
  const children = node.children;
  const parts = [];
  for (let i = 0; i < children.length; i++) {
    const child = children[i];
    const prev = children[i - 1];
    const next = children[i + 1];
    if (child.type === 'text' && child.value === '(' && next?.type === 'link') continue;
    if (child.type === 'text' && child.value === ')' && prev?.type === 'link') continue;
    if (child.type === 'text') parts.push(child.value);
    else if (child.type === 'link') parts.push(`(${child.url ?? ''})`);
    else if (child.type === 'image') parts.push(`![${child.alt ?? ''}](${child.url ?? ''})`);
  }
  return parts.join('');
}
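
As a quick sanity check, the function can be fed the two paragraph shapes shown earlier. The trees below are hand-built for illustration (positions omitted), and the function body is copied from above so the snippet runs on its own.

```javascript
// Copied from the plugin so this snippet is self-contained.
function paragraphToText(node) {
  if (!node?.children) return '';
  const children = node.children;
  const parts = [];
  for (let i = 0; i < children.length; i++) {
    const child = children[i];
    const prev = children[i - 1];
    const next = children[i + 1];
    if (child.type === 'text' && child.value === '(' && next?.type === 'link') continue;
    if (child.type === 'text' && child.value === ')' && prev?.type === 'link') continue;
    if (child.type === 'text') parts.push(child.value);
    else if (child.type === 'link') parts.push(`(${child.url ?? ''})`);
    else if (child.type === 'image') parts.push(`![${child.alt ?? ''}](${child.url ?? ''})`);
  }
  return parts.join('');
}

// Relative video path: the parenthesized part is a single text node.
const relative = paragraphToText({
  type: 'paragraph',
  children: [
    { type: 'image', url: '1.jpg', alt: '示例图' },
    { type: 'text', value: '(1.mov)' },
  ],
});

// Full URL: autolink produced '(' + link + ')'.
const autolinked = paragraphToText({
  type: 'paragraph',
  children: [
    { type: 'image', url: '1.jpg', alt: '示例图' },
    { type: 'text', value: '(' },
    { type: 'link', url: 'https://example.com/1.mov' },
    { type: 'text', value: ')' },
  ],
});

console.log(relative);   // ![示例图](1.jpg)(1.mov)
console.log(autolinked); // ![示例图](1.jpg)(https://example.com/1.mov)
```

Both shapes collapse to the same textual form, which is what makes a single regex usable again.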

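Joining yields a string of the form ![alt](img)(video) again, which can be matched with essentially the same regex as in the Hexo days (only the character classes are stricter):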

const LIVE_PHOTO_RE = /!\[([^\]]*)\]\(([^)]+)\)\(([^)]+)\)/g;
const FIGURE_OPEN = '<figure class="live-photo-container"';
const FIGURE_CLOSE = '</figure>';
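
To see what the pattern captures, here is a small check on a made-up sample string. It only exercises the regex; nothing here is specific to Remark.

```javascript
const LIVE_PHOTO_RE = /!\[([^\]]*)\]\(([^)]+)\)\(([^)]+)\)/g;

// One Live Photo and one ordinary image in the same line of text.
const sample = 'Before ![示例图](1.jpg)(1.mov) and a plain ![only image](2.jpg) after';

// Groups are, in order: alt text, image path, video path.
const matches = [...sample.matchAll(LIVE_PHOTO_RE)].map(([, alt, img, video]) => ({ alt, img, video }));

console.log(matches.length); // 1
console.log(matches[0]);     // { alt: '示例图', img: '1.jpg', video: '1.mov' }
```

An ordinary image is left alone because the pattern requires a second parenthesized group immediately after the first.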

On a match, toHtml generates HTML carrying data-live-photo, with attribute values escaped by escapeHtmlAttr:

function escapeHtmlAttr(str) {
  if (!str) return '';
  return str.replace(/[&<>"']/g, (c) => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' })[c] ?? c);
}

function toHtml(img, video, alt) {
  const cap = alt ? `<figcaption class="live-photo-caption">${escapeHtmlAttr(alt)}</figcaption>` : '';
  return `<figure class="live-photo-container">
<div data-live-photo data-photo-src="${escapeHtmlAttr(img)}" data-video-src="${escapeHtmlAttr(video)}" class="live-photo-wrapper"></div>
${cap}
</figure>`;
}

Finally, splitHtmlNodes cuts the string at FIGURE_OPEN / FIGURE_CLOSE. The HTML parts become html nodes, and any text left after a closing tag is kept as a separate text node:

function splitHtmlNodes(value, openTag, closeTag) {
  if (!value.includes(openTag)) return [{ type: 'text', value }];
  const splitRe = new RegExp(`(?=${openTag.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')})`, 'g');
  return value.split(splitRe).filter(Boolean).flatMap((part) => {
    if (!part.startsWith(openTag)) return [{ type: 'text', value: part }];
    const closeIdx = part.indexOf(closeTag);
    if (closeIdx === -1) return [{ type: 'html', value: part }];
    const nodes = [{ type: 'html', value: part.slice(0, closeIdx + closeTag.length) }];
    const remainder = part.slice(closeIdx + closeTag.length);
    if (remainder) nodes.push({ type: 'text', value: remainder });
    return nodes;
  });
}
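
A short example of what the split produces. The input string is invented for illustration, and the function is copied from above so the snippet is self-contained.

```javascript
const FIGURE_OPEN = '<figure class="live-photo-container"';
const FIGURE_CLOSE = '</figure>';

// Copied from the plugin.
function splitHtmlNodes(value, openTag, closeTag) {
  if (!value.includes(openTag)) return [{ type: 'text', value }];
  const splitRe = new RegExp(`(?=${openTag.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')})`, 'g');
  return value.split(splitRe).filter(Boolean).flatMap((part) => {
    if (!part.startsWith(openTag)) return [{ type: 'text', value: part }];
    const closeIdx = part.indexOf(closeTag);
    if (closeIdx === -1) return [{ type: 'html', value: part }];
    const nodes = [{ type: 'html', value: part.slice(0, closeIdx + closeTag.length) }];
    const remainder = part.slice(closeIdx + closeTag.length);
    if (remainder) nodes.push({ type: 'text', value: remainder });
    return nodes;
  });
}

// Text before the figure, the figure itself, and text after it.
const nodes = splitHtmlNodes(
  'intro <figure class="live-photo-container">X</figure> tail',
  FIGURE_OPEN,
  FIGURE_CLOSE,
);

console.log(nodes.map((n) => n.type)); // [ 'text', 'html', 'text' ]
console.log(nodes[1].value);           // <figure class="live-photo-container">X</figure>
```

The lookahead in the split regex keeps the opening tag attached to the part that follows it, so each figure stays in one piece.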

The plugin entry point remarkLivePhoto visits only paragraph nodes and performs the three steps (join, replace, split) in the callback:

import { visit } from 'unist-util-visit';

export function remarkLivePhoto() {
  return (tree) => {
    visit(tree, 'paragraph', (node) => {
      const text = paragraphToText(node);
      if (!text.includes('![')) return;
      let changed = false;
      const value = text.replace(LIVE_PHOTO_RE, (_, alt, img, video) => {
        changed = true;
        return toHtml(img || '', video || '', alt || '');
      });
      if (changed) node.children = splitHtmlNodes(value, FIGURE_OPEN, FIGURE_CLOSE);
    });
  };
}

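Client-side rendering is handled by the LivePhotoInit component: once it detects [data-live-photo] on the page, it loads LivePhotosKit JS dynamically. The only difference from before is that the HTML is now generated at Astro build time rather than during Hexo rendering.

Supporting APlayer

Besides paragraph, the APlayer plugin also visits text and html nodes so that a tag is not missed when it lands in a different node type.

Its paragraphToText is similar to the Live Photo one but handles only text and link, and a link is joined back as its bare url because the links in tag arguments were never written in Markdown link syntax: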

function paragraphToText(node) {
  return (node.children ?? [])
    .map((c) => {
      if (c.type === 'text') return c.value;
      if (c.type === 'link') return c.url ?? '';
      return '';
    })
    .join('');
}

The tags are matched with this regex:

const TAG_RE = /\{%\s*(aplayer|meting)\s+([\s\S]*?)\s*%\}/g;
const CONTAINER_OPEN = '<div class="aplayer-container';

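{% aplayer %} goes through parseAplayer, which takes the last two http(s):// links in the arguments as the audio and cover URLs, splits the text before them on whitespace into title and artist (the last word is the artist, everything before it is the title), and hands any extra arguments after the cover, such as autoplay or fixed, to parseOptions: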

function parseAplayer(content) {
  const trimmed = content.trim();
  const urls = [...trimmed.matchAll(/https?:\/\/[^\s]+/g)].map((m) => m[0]);
  if (urls.length < 2) return null;
  const url = urls.at(-2);
  const cover = urls.at(-1);
  const before = trimmed.slice(0, trimmed.indexOf(url)).trim().split(/\s+/).filter(Boolean);
  if (before.length < 2) return null;
  const artist = before.pop();
  const name = before.join(' ');
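  // The cover is the last URL, so search from the end in case the same string also occurs earlier
  const extras = trimmed.slice(trimmed.lastIndexOf(cover) + cover.length).trim().split(/\s+/).filter(Boolean);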
  return { audio: { name, artist, url, cover }, playerOptions: parseOptions(extras) };
}

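{% meting %} goes through parseMeting, which uses a regex to extract the arguments one by one, whether quoted (straight or curly quotes) or bare, and likewise accepts hexo-tag-aplayer options such as autoplay, fixed, mini and theme:#xxx: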

function parseMeting(content) {
  const args = [];
  const re = /\u201c([^\u201d]*)\u201d|\u2018([^\u2019]*)\u2019|"([^"]*)"|'([^']*)'|(\S+)/g;
  const trim = /^[\s\u201c\u201d\u2018\u2019"']+|[\s\u201c\u201d\u2018\u2019"']+$/g;
  let m;
  while ((m = re.exec(content)) !== null) {
    args.push((m[1] ?? m[2] ?? m[3] ?? m[4] ?? m[5]).replace(trim, ''));
  }
  if (args.length < 3) return null;
  const [id, server, type, ...extras] = args;
  return { id, server, type, playerOptions: parseOptions(extras) };
}
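
To make the tokenizing step easier to follow, the sketch below pulls the same two regexes out into a standalone helper. The name tokenize is made up for this example; the plugin keeps the loop inline in parseMeting.

```javascript
// Hypothetical helper: the tokenizing loop of parseMeting on its own.
function tokenize(content) {
  const args = [];
  // Alternatives, in order: curly double quotes, curly single quotes,
  // straight double quotes, straight single quotes, bare token.
  const re = /\u201c([^\u201d]*)\u201d|\u2018([^\u2019]*)\u2019|"([^"]*)"|'([^']*)'|(\S+)/g;
  // Strips stray quotes or whitespace left at either end of a token.
  const trim = /^[\s\u201c\u201d\u2018\u2019"']+|[\s\u201c\u201d\u2018\u2019"']+$/g;
  let m;
  while ((m = re.exec(content)) !== null) {
    args.push((m[1] ?? m[2] ?? m[3] ?? m[4] ?? m[5]).replace(trim, ''));
  }
  return args;
}

const straight = tokenize('"1930226368" "netease" "song" "autoplay" theme:#b7daff');
const curly = tokenize('\u201c1930226368\u201d \u2018netease\u2019 song');

console.log(straight); // [ '1930226368', 'netease', 'song', 'autoplay', 'theme:#b7daff' ]
console.log(curly);    // [ '1930226368', 'netease', 'song' ]
```

The curly-quote alternatives matter when quotes in old posts were typed or converted as typographic quotes rather than straight ones.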

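parseOptions accepts hexo-tag-aplayer's boolean switches and key:value pairs. narrow maps to mini, listfolded maps to listFolded, and keys are lowercased and then aliased, so listmaxheight, lrctype and storagename become listMaxHeight, lrcType and storageName: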

const BOOL_FLAGS = new Set(['autoplay', 'fixed', 'mini', 'narrow', 'listfolded']);
const ALIASES = { listmaxheight: 'listMaxHeight', listfolded: 'listFolded', lrctype: 'lrcType', storagename: 'storageName' };

function parseOptions(extras) {
  const opts = {};
  for (const item of extras) {
    if (BOOL_FLAGS.has(item)) {
      if (item === 'narrow') opts.mini = true;
      else if (item === 'listfolded') opts.listFolded = true;
      else opts[item] = true;
      continue;
    }
    const i = item.indexOf(':');
    if (i === -1) continue;
    const key = ALIASES[item.slice(0, i).toLowerCase()] ?? item.slice(0, i).toLowerCase();
    const val = item.slice(i + 1);
    if (key === 'volume') {
      const n = Number(val);
      if (Number.isFinite(n)) opts[key] = n;
    } else if (key === 'mutex') {
      opts[key] = val === 'true';
    } else {
      opts[key] = val;
    }
  }
  return opts;
}
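
Here is what the function returns for a mixed list of extras. The input values are invented for illustration, and the constants and function are copied from above so the snippet runs on its own.

```javascript
const BOOL_FLAGS = new Set(['autoplay', 'fixed', 'mini', 'narrow', 'listfolded']);
const ALIASES = { listmaxheight: 'listMaxHeight', listfolded: 'listFolded', lrctype: 'lrcType', storagename: 'storageName' };

// Copied from the plugin.
function parseOptions(extras) {
  const opts = {};
  for (const item of extras) {
    if (BOOL_FLAGS.has(item)) {
      if (item === 'narrow') opts.mini = true;
      else if (item === 'listfolded') opts.listFolded = true;
      else opts[item] = true;
      continue;
    }
    const i = item.indexOf(':');
    if (i === -1) continue;
    const key = ALIASES[item.slice(0, i).toLowerCase()] ?? item.slice(0, i).toLowerCase();
    const val = item.slice(i + 1);
    if (key === 'volume') {
      const n = Number(val);
      if (Number.isFinite(n)) opts[key] = n;
    } else if (key === 'mutex') {
      opts[key] = val === 'true';
    } else {
      opts[key] = val;
    }
  }
  return opts;
}

const opts = parseOptions([
  'autoplay', 'narrow', 'listfolded',                // boolean switches
  'volume:0.5', 'mutex:false', 'theme:#ff0000',      // key:value pairs
  'lrcType:3',                                       // key is lowercased, then aliased back
  'bogus',                                           // no colon and not a flag: ignored
]);

console.log(opts);
// { autoplay: true, mini: true, listFolded: true, volume: 0.5,
//   mutex: false, theme: '#ff0000', lrcType: '3' }
```

Only volume and mutex are converted to a number and a boolean. Every other value, including lrcType here, stays a string, so whatever consumes the options has to tolerate that.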

replace dispatches to the right parser by tag name, merges in the DEFAULTS values, and lets toHtml build the container for the given mode:

function escapeHtmlAttr(str) {
  if (!str) return '';
  return str.replace(/[&<>"']/g, (c) => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' })[c] ?? c);
}

const DEFAULTS = { fixed: false, theme: '#b7daff', loop: 'all', order: 'list', volume: 0.7, autoplay: false };

function toHtml(mode, payload) {
  const fixed = (mode === 'meting' ? payload.playerOptions : payload)?.fixed;
  const attr =
    mode === 'meting'
      ? `data-aplayer-mode="meting" data-meting-options="${escapeHtmlAttr(JSON.stringify(payload))}"`
      : `data-aplayer-mode="direct" data-aplayer-options="${escapeHtmlAttr(JSON.stringify(payload))}"`;
  return `<div class="aplayer-container${fixed ? ' aplayer-container--fixed' : ''}" ${attr}></div>`;
}

function replace(text) {
  if (!text?.includes('{%')) return { value: text, changed: false };
  let changed = false;
  const value = text.replace(TAG_RE, (full, tag, args) => {
    if (tag === 'aplayer') {
      const p = parseAplayer(args);
      if (!p) return full;
      changed = true;
      return toHtml('direct', { audio: [p.audio], ...DEFAULTS, ...p.playerOptions });
    }
    if (tag === 'meting') {
      const p = parseMeting(args);
      if (!p) return full;
      changed = true;
      return toHtml('meting', { id: p.id, server: p.server, type: p.type, playerOptions: { ...DEFAULTS, ...p.playerOptions } });
    }
    return full;
  });
  return { value, changed };
}

toNodes splits the replaced string back into mdast nodes, and the plugin entry point processes paragraph, text and html nodes in turn:

import { visit } from 'unist-util-visit';

function toNodes(text) {
  if (!text.includes(CONTAINER_OPEN)) return [{ type: 'text', value: text }];
  return text.split(/(?=<div class="aplayer-container)/).filter(Boolean).map((part) =>
    part.startsWith(CONTAINER_OPEN) ? { type: 'html', value: part } : { type: 'text', value: part },
  );
}

export function remarkAplayer() {
  return (tree) => {
    visit(tree, 'paragraph', (node) => {
      const text = paragraphToText(node);
      if (!text.includes('{%')) return;
      const { value, changed } = replace(text);
      if (changed) node.children = toNodes(value);
    });
    visit(tree, 'text', (node, i, parent) => {
      const { value, changed } = replace(node.value);
      if (changed && parent?.children) parent.children.splice(i, 1, ...toNodes(value));
    });
    visit(tree, 'html', (node) => {
      const { value, changed } = replace(node.value);
      if (changed) node.value = value;
    });
  };
}

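The plugin only emits a container carrying the configuration in data-* attributes. After the page loads, the APlayerInit component reads those attributes and initializes the player.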
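
The APlayerInit component is not shown in this post, so the following is only a rough sketch of the reading side under that caveat. readPlayerConfig is a made-up name; it takes a dataset-like object (in the browser this would be element.dataset) and recovers the JSON that the plugin serialized.

```javascript
// Hypothetical helper: recover the mode and options from a container's data-* attributes.
// data-aplayer-mode    -> dataset.aplayerMode
// data-aplayer-options -> dataset.aplayerOptions
// data-meting-options  -> dataset.metingOptions
function readPlayerConfig(dataset) {
  const mode = dataset.aplayerMode;
  const raw = mode === 'meting' ? dataset.metingOptions : dataset.aplayerOptions;
  if (!raw) return null;
  try {
    return { mode, options: JSON.parse(raw) };
  } catch {
    // Malformed JSON: skip this container rather than break the page.
    return null;
  }
}

// Plain objects stand in for element.dataset so the sketch runs outside a browser.
const direct = readPlayerConfig({
  aplayerMode: 'direct',
  aplayerOptions: JSON.stringify({ audio: [{ name: 'Title', artist: 'Artist' }], fixed: false }),
});
const meting = readPlayerConfig({
  aplayerMode: 'meting',
  metingOptions: JSON.stringify({ id: '1930226368', server: 'netease', type: 'song' }),
});

console.log(direct.options.audio[0].name); // Title
console.log(meting.options.server);        // netease
```

In the browser the HTML parser has already unescaped the attribute value by the time it appears in dataset, so JSON.parse can be applied to it directly.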

With the migration done, the original syntax works without any changes:

{% aplayer 你离开了南京,从此没有人和我说话 李志 https://.../audio.mp3 https://.../cover.jpg %}

{% meting "1930226368" "netease" "song" %}
{% meting "1930226368" "netease" "song" "autoplay" "fixed" %}

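meting options such as autoplay, fixed and mini are parsed into the corresponding player settings, and anything not specified falls back to the theme color, loop mode, volume and other values in DEFAULTS, matching the Hexo-era behavior.

The player at the top of this post is rendered straight from the old syntax, without a single character changed for the migration.

Afterword

The biggest takeaway from all this tinkering: Remark first converts your Markdown into mdast, and much syntax that looks intact in the source is already unrecognizable by the time it reaches a plugin.

If you run into a similar need, print the AST first and then decide which node to start from.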