Compute-Intensive Tasks: WASM or Web Worker?

Preface

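I recently had a requirement to read an xlsx file on the front end and validate the data inside it. The file holds roughly 10,000 rows, and the business requirements produce several kinds of validation rules: values must not be duplicated, the selected file must be in the expected format, certain fields have their own special rules, and so on. Once the data reaches a certain size, the performance of the validation algorithm starts to matter. Because JavaScript is single-threaded, running the validation directly on the main thread can block page rendering while the computation runs (how long depends on the device), and to the user the page looks frozen.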
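To make the kind of rule described above concrete, here is a minimal sketch of a duplicate check plus one field rule. The `Row` shape, the `id` and `phone` columns, and the 11-digit pattern are all assumptions made for illustration; the real spreadsheet columns and rules are not given in this post.

```typescript
// Hypothetical row shape; the real columns depend on the spreadsheet.
interface Row {
  id: string;
  phone: string;
}

interface RowError {
  rowIndex: number; // zero-based index into the input array
  message: string;
}

// Assumed field rule for illustration only: exactly 11 digits.
const PHONE_PATTERN = /^\d{11}$/;

function validateRows(rows: Row[]): RowError[] {
  const errors: RowError[] = [];
  // A Set gives constant-time lookups, so the duplicate check stays linear
  // in the number of rows instead of comparing every pair of rows.
  const seenIds = new Set<string>();

  rows.forEach((row, rowIndex) => {
    if (seenIds.has(row.id)) {
      errors.push({ rowIndex, message: `duplicate id: ${row.id}` });
    } else {
      seenIds.add(row.id);
    }
    if (!PHONE_PATTERN.test(row.phone)) {
      errors.push({ rowIndex, message: `invalid phone: ${row.phone}` });
    }
  });

  return errors;
}
```

With 10,000 rows, the choice of data structure for the duplicate check probably matters at least as much as where the code runs.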

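So I needed to consider Web Workers or WASM for this kind of compute-intensive task. A Web Worker is the browser's multithreading mechanism for JS: it moves a JS computation onto a separate thread. WASM is a binary format that compiled languages can target, which brings their performance advantages to the web. Note that a WASM function called from the main thread still runs on the main thread; to keep it from blocking rendering it has to be called from inside a worker, so the two are complementary rather than mutually exclusive.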

To compare their performance, I ran a simple test that computes Fibonacci numbers on both sides.

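Testing Web Worker performance

A Web Worker is the browser's way of doing multithreading. It lets you create background threads outside the main thread to run time-consuming computations without blocking UI rendering and interaction.

Test code: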


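// Runs validFunction inside a Web Worker and resolves with its result.
function runInWorker(num: number): Promise<number> {
    return new Promise((resolve, reject) => {
        // The timer starts before the worker exists, so the measurement
        // includes worker startup and message passing.
        const start = performance.now();
        // The worker rebuilds the function from its source text and calls it.
        const workerUrl = URL.createObjectURL(new Blob([`
            self.onmessage = function (e) {
                const funcString = e.data.func;
                const args = e.data.args;
                const func = new Function('return ' + funcString)();
                const result = func(...args);
                self.postMessage(result);
            };
        `], { type: 'application/javascript' }));
        const worker = new Worker(workerUrl);
        // Naive recursive Fibonacci. It must be self-contained because only
        // its source text is sent to the worker.
        const validFunction = (totalNum: number) => {
            function fibonacci(n: number): number {
                if (n <= 1) {
                    return n;
                }
                return fibonacci(n - 1) + fibonacci(n - 2);
            }
            const result = fibonacci(totalNum);
            return result;
        };
        const functionString = validFunction.toString();

        worker.onmessage = function (e) {
            const end = performance.now();
            console.log('Worker computed n=' + num + ', result ' + e.data + ', time (ms):', end - start);
            resolve(e.data);
            worker.terminate();
            URL.revokeObjectURL(workerUrl);
        };
        worker.onerror = function (error) {
            console.error('Worker error:', error.message);
            reject(error);
            worker.terminate();
            URL.revokeObjectURL(workerUrl);
        };
        worker.postMessage({ func: functionString, args: [num] });
    });
}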

Test results

Time taken to compute the Fibonacci number for each n from 1 to 46:

n	Result	Time (ms)
1	1	6
2	1	5
3	2	8
4	3	6
5	5	6
6	8	5
7	13	5
8	21	3
9	34	9
10	55	4
11	89	4
12	144	2
13	233	5
14	377	5
15	610	7
16	987	5
17	1597	6
18	2584	5
19	4181	5
20	6765	6
21	10946	6
22	17711	7
23	28657	5
24	46368	8
25	75025	8
26	121393	11
27	196418	13
28	317811	16
29	514229	13
30	832040	28
31	1346269	38
32	2178309	54
33	3524578	72
34	5702887	77
35	9227465	142
36	14930352	213
37	24157817	326
38	39088169	523
39	63245986	816
40	102334155	1309
41	165580141	2093
42	267914296	3388
43	433494437	5679
44	701408733	8742
45	1134903170	13986
46	1836311903	22875

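Testing WASM performance

WebAssembly (Wasm) is a binary instruction format; put simply, it is binary code that the browser can run. In that respect it is loosely reminiscent of the ActiveX controls in IE years ago, although Wasm runs inside the browser sandbox. Compiled languages can target it, which brings much of their performance advantage to the web. For compute-heavy code Wasm is usually faster than JS, and its binary format is typically more compact than the equivalent JS text.

Test code

For this test I compiled the WASM from Rust. The code is as follows: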


use wasm_bindgen::prelude::*;
 
#[wasm_bindgen]
pub fn fibonacci_wasm(n: u32) -> u32 {
    if n <= 1 {
        n
    } else {
        fibonacci_wasm(n - 1) + fibonacci_wasm(n - 2)
    }
}

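Compile it to a wasm file with wasm-pack build --target web --release; wasm-pack runs wasm-bindgen to generate the JS bindings.

The build produces a compiled wasm file and a JS binding file.

The JS file contains the bindings and the wasm file is the compiled binary. The other two files are type declarations: one for the wasm module and one for the JS bindings.

I copied the build output directory pkg into my test project. The output already includes a package.json: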


{
  "name": "fibonacci-wasm",
  "type": "module",
  "version": "0.1.0",
  "files": [
    "fibonacci_wasm_bg.wasm",
    "fibonacci_wasm.js",
    "fibonacci_wasm.d.ts"
  ],
  "main": "fibonacci_wasm.js",
  "types": "fibonacci_wasm.d.ts",
  "sideEffects": [
    "./snippets/*"
  ]
}

So in my test project's package.json I installed it as a local dependency:


"fibonacci-wasm": "file:src/pkg",

Then run npm i once so that static checking can resolve the package.

It is then used like this:


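import init, { fibonacci_wasm } from 'fibonacci-wasm';

// Loads the wasm module, then times a single call on the calling thread.
// Module loading (init) is outside the timed region.
async function runWasm(num: number): Promise<number> {
    await init();
    const start = performance.now();
    const result = fibonacci_wasm(num);
    const end = performance.now();
    console.log('WASM computed n=' + num + ', result ' + result + ', time (ms):', end - start);
    return result;
}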

Test results

n	Result	Time (ms)
1	1	0
2	1	0
3	2	0
4	3	0
5	5	0
6	8	0
7	13	0
8	21	0
9	34	0
10	55	0
11	89	0
12	144	0
13	233	0
14	377	0
15	610	0
16	987	0
17	1597	0
18	2584	0
19	4181	0
20	6765	1
21	10946	0
22	17711	1
23	28657	0
24	46368	1
25	75025	1
26	121393	1
27	196418	1
28	317811	5
29	514229	6
30	832040	10
31	1346269	10
32	2178309	25
33	3524578	25
34	5702887	51
35	9227465	70
36	14930352	101
37	24157817	150
38	39088169	212
39	63245986	359
40	102334155	567
41	165580141	904
42	267914296	1456
43	433494437	2314
44	701408733	3749
45	1134903170	6031
46	1836311903	9893

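Comparing the results

My machine has an M1 Pro chip, which is reasonably fast, yet n=46 already takes a long time. Each increment of n increases the running time by roughly 60%, so there is little point going further and I stopped at 46. Many machines are slower than an M1 Pro, so the absolute timings on other hardware will differ from mine.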
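The roughly 60% growth per step can be checked against the measurements. The number of calls made by the naive recursion grows in proportion to the Fibonacci numbers themselves, so the running time should grow by about the golden ratio (about 1.618) per increment of n. The sketch below uses the worker timings for n = 40 to 46 from the worker results table above; the helper name `growthRatios` is my own.

```typescript
// Worker timings in ms for n = 40..46, copied from the worker results table.
const workerMs: number[] = [1309, 2093, 3388, 5679, 8742, 13986, 22875];

// Ratio between each timing and the one before it.
function growthRatios(timings: number[]): number[] {
  const ratios: number[] = [];
  for (let i = 1; i < timings.length; i++) {
    ratios.push(timings[i] / timings[i - 1]);
  }
  return ratios;
}

const ratios = growthRatios(workerMs);
const average = ratios.reduce((sum, r) => sum + r, 0) / ratios.length;
const goldenRatio = (1 + Math.sqrt(5)) / 2;

console.log(ratios.map((r) => r.toFixed(2)).join(", "));
console.log(`average ${average.toFixed(2)}, golden ratio ${goldenRatio.toFixed(3)}`);
```

Individual ratios are noisy because each n was timed once, but the average sits close to the golden ratio.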

Here are the results side by side:

n	Result	Web Worker time (ms)	WASM time (ms)
1	1	6	0
2	1	5	0
3	2	8	0
4	3	6	0
5	5	6	0
6	8	5	0
7	13	5	0
8	21	3	0
9	34	9	0
10	55	4	0
11	89	4	0
12	144	2	0
13	233	5	0
14	377	5	0
15	610	7	0
16	987	5	0
17	1597	6	0
18	2584	5	0
19	4181	5	0
20	6765	6	1
21	10946	6	0
22	17711	7	1
23	28657	5	0
24	46368	8	1
25	75025	8	1
26	121393	11	1
27	196418	13	1
28	317811	16	5
29	514229	13	6
30	832040	28	10
31	1346269	38	10
32	2178309	54	25
33	3524578	72	25
34	5702887	77	51
35	9227465	142	70
36	14930352	213	101
37	24157817	326	150
38	39088169	523	212
39	63245986	816	359
40	102334155	1309	567
41	165580141	2093	904
42	267914296	3388	1456
43	433494437	5679	2314
44	701408733	8742	3749
45	1134903170	13986	6031
46	1836311903	22875	9893

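WASM is clearly faster than the Web Worker version. The worker takes a few milliseconds even at n=1; this is a fixed overhead, because the worker timer also covers creating the worker and passing messages, while the WASM timer covers only the function call. When the workload is small the gap exists but is negligible. Around n=35 (this varies with the CPU) the gap becomes noticeable, and the absolute difference keeps widening as n grows, so the WASM advantage becomes more and more apparent.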
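To put a number on the gap, the speedup can be computed directly from a few rows of the comparison table. This is only arithmetic on the measurements above; the `speedup` helper and the choice of sample rows are mine.

```typescript
// [n, workerMs, wasmMs] rows copied from the comparison table.
const samples: Array<[number, number, number]> = [
  [30, 28, 10],
  [35, 142, 70],
  [40, 1309, 567],
  [46, 22875, 9893],
];

// How many times faster the WASM run was than the worker run.
function speedup(workerMs: number, wasmMs: number): number {
  return workerMs / wasmMs;
}

for (const [n, workerMs, wasmMs] of samples) {
  console.log(`n=${n}: WASM is ${speedup(workerMs, wasmMs).toFixed(2)}x as fast`);
}
```

Rows with very small timings are left out, since a WASM time of 0 ms or 1 ms makes the ratio meaningless.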

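In this test WASM is about 2.3 times as fast as the JS running in the worker at large n (22875 ms versus 9893 ms at n=46). That is a solid gain, though smaller than one might expect. Rust's main strengths are efficient memory handling and natively compiled code without dynamic-typing overhead, and this test function is a plain recursion rather than an iterative algorithm, which is not the scenario that shows WASM at its best. Either way, WASM was consistently faster than JS here.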
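For contrast with the recursive benchmark function, here is a sketch of the iterative form mentioned above. It is not part of the original test; it only illustrates that the choice of algorithm can matter far more than the choice of runtime, since this version does n additions instead of an exponentially growing number of calls.

```typescript
// Iterative Fibonacci: linear time, constant memory.
// Results are exact for n <= 78; beyond that they exceed
// Number.MAX_SAFE_INTEGER and lose precision.
function fibonacciIterative(n: number): number {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError("n must be a non-negative integer");
  }
  let a = 0; // F(0)
  let b = 1; // F(1)
  for (let i = 0; i < n; i++) {
    // Slide the window one step: (F(i), F(i+1)) -> (F(i+1), F(i+2)).
    [a, b] = [b, a + b];
  }
  return a;
}

console.log(fibonacciIterative(46)); // 1836311903, matching the tables above
```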

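How to choose in real-world scenarios?

Given these results, should WASM be the first choice for compute-intensive tasks in practice? The answer is: not necessarily.

In real development, adopting WASM raises costs for the team, mainly in: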

The cost of learning a new language
The complexity of cross-language calls and less convenient debugging

By comparison, the advantages of Web Workers are:

Simple to use, with no new language to learn
Direct use of JS APIs and JS dependency libraries

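Whether to choose Web Workers or WASM ultimately depends on the business scenario and the team's tech stack. For example, although the worker is slower, validating 10,000 rows of xlsx data took only a few hundred milliseconds in my own measurements, which is well within an acceptable range. If one day the requirement were to validate 1,000,000 rows, I would certainly consider WASM instead.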

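WASM is also used when code needs to be ported, for example when a C++ library has to run directly on the web. Another example: I once built a map application that drew points and lines from GPS data. When there was too much GPS data, performance together with Mapbox was poor; that kind of scenario is also a good fit for moving some of the computation into WASM.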