Heterogeneous Integration of a Legacy C++ Service with a Modern tRPC Stack via Envoy Proxy


Defining the Problem: The Untouchable High-Performance C++ Core

We faced a classic technical-debt scenario. At the heart of the system sits a high-performance computation service written in C++ that speaks a custom binary protocol over TCP. It is stable and efficient, but it has also become an island: there is no documentation, the original developers left long ago, and any change to its code carries enormous risk and regression-testing cost.

The business requirement was to build a new web front end for this service, with real-time data display and control. The team chose a modern stack: Vue.js on the front end, an ASP.NET Core BFF (Backend for Frontend), and tRPC as the API layer for end-to-end type safety.

The core tension: how do we let an HTTP/JSON web stack communicate safely and efficiently with this custom-TCP legacy C++ service, while keeping the overall architecture observable, resilient, and extensible?

Option A: Build a Dedicated Adapter Service

The first idea that comes to mind is a dedicated adapter service, also built with ASP.NET Core, whose sole responsibility is two-way protocol translation.

graph TD
    A[Vue.js Frontend] -- tRPC/HTTP --> B[ASP.NET Core BFF];
    B -- REST/HTTP --> C[Adapter Service];
    C -- Custom TCP --> D[Legacy C++ Service];

Implementation outline:

  1. Hard-code the parsing and framing logic for the C++ service's private TCP protocol inside the Adapter Service.
  2. Expose a standard RESTful API or gRPC interface from the Adapter Service.
  3. Have the BFF call the Adapter Service's API to interact with the C++ service indirectly.

Advantages:

  • Centralized logic: all of the ugly, non-standard protocol handling is encapsulated inside the Adapter Service, keeping the BFF layer clean.
  • Fast to build: for a simple scenario, this is probably the quickest way to get the job done.

Disadvantages:

  • Tight coupling: the Adapter Service is bound to the C++ service's implementation details (IP, port, wire format). If the C++ service is migrated or its network policy changes, the adapter must be modified and redeployed.
  • No resilience: service discovery, load balancing, circuit breaking, retries, and timeouts all have to be hand-coded in the Adapter Service. That is a great deal of work, and it is easy to get wrong.
  • An observability black hole: the call chain from the BFF to the C++ service is severed here. Latency, success rates, and other key metrics for this hop become hard to obtain.
  • Blurred responsibilities: the Adapter Service handles both protocol translation and network-communication policy, violating the single-responsibility principle, and it grows steadily harder to maintain.

In real projects this approach tends to evolve into a new, hard-to-maintain "glue" monolith, so we rejected it quickly.

Option B: Introduce a Service Mesh for Protocol Translation and Traffic Governance

The core idea is to pull network-communication logic out of the business applications and push it down into the infrastructure layer, handled by a service mesh's sidecar proxies. We chose Envoy Proxy.

graph TD
    subgraph Frontend
        A[Vue.js App]
    end
    subgraph BFF Server
        B[ASP.NET Core BFF]
    end
    subgraph Legacy Server
        E[Legacy C++ Service]
    end
    subgraph Envoy Sidecars
        C[Envoy Proxy for BFF]
        D[Envoy Proxy for Legacy]
    end

    A -- tRPC/HTTP --> B;
    B -- gRPC (to localhost) --> C;
    C -- mTLS --> D;
    D -- Custom TCP (from localhost) --> E;

In this architecture, Envoy plays several key roles:

  1. Protocol translation: the BFF believes it is talking to a standard gRPC service (via its local Envoy). The BFF-side Envoy receives the gRPC request and uses a built-in filter (for example a Lua filter) to translate it into the binary TCP stream the C++ service understands.
  2. Traffic governance: retry, timeout, circuit-breaking, and load-balancing policies are declared in Envoy configuration rather than coded in ASP.NET Core (see the sketch after this list).
  3. Observability: Envoy natively emits detailed metrics, logs, and traces that plug into Prometheus, ELK, Jaeger, and similar systems for end-to-end observability.
  4. Security: services can communicate over mTLS through their Envoys, even if they do not support TLS themselves.
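
To make item 2 concrete, the fragment below is a minimal sketch of what declarative resilience looks like on the gRPC-facing route and on the upstream cluster. The field names are standard Envoy configuration; the values, certificate paths, and the choice of which routes and clusters to attach them to are illustrative assumptions rather than part of this project's actual envoy.yaml.

# Sketch only: illustrative values, not a complete listener/cluster definition.
route_config:
  virtual_hosts:
  - name: local_service
    domains: ["*"]
    routes:
    - match: { prefix: "/" }
      route:
        cluster: translation_cluster
        timeout: 2s                              # overall per-request deadline
        retry_policy:
          retry_on: "connect-failure,refused-stream,unavailable"
          num_retries: 3
          per_try_timeout: 0.5s

clusters:
- name: legacy_cpp_service_cluster
  connect_timeout: 5s
  circuit_breakers:                              # cap pressure on the legacy upstream
    thresholds:
    - max_connections: 100
      max_pending_requests: 50
      max_retries: 3
  # Optional mTLS toward the peer Envoy (item 4); certificate paths are assumptions.
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
      common_tls_context:
        tls_certificates:
        - certificate_chain: { filename: "/etc/envoy/certs/bff.crt" }
          private_key: { filename: "/etc/envoy/certs/bff.key" }

None of this touches the ASP.NET Core or C++ code, which is precisely the point of pushing these concerns into the sidecar.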

Why we chose it:
Envoy does add deployment complexity, but it cleanly decouples network concerns from business concerns. The long-term maintainability, extensibility, and resilience of this architecture far exceed Option A, and it lets us modernize the infrastructure gradually without the business code noticing. It is the more sustainable engineering decision.

Core Implementation Overview

We will build a minimal runnable example of Option B's core components. Assume the C++ service exposes a GetMarketData operation: it receives a 4-byte ticker symbol (such as "GOOG") and returns a structured binary packet.

1. The Legacy C++ Service

This is a simple TCP server built on Boost.Asio. It listens on port 9090 and speaks our custom protocol.

Protocol definition:

  • Request: [4-byte ticker symbol] (e.g., 'G', 'O', 'O', 'G')
  • Response: [4-byte ticker symbol][8-byte price (double)][8-byte volume (uint64_t)], 20 bytes total, packed, host (little-endian) byte order

legacy_server.cpp

#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>
#include <boost/asio.hpp>

using boost::asio::ip::tcp;

// A simple representation of our binary response structure.
// Packed so the wire size is exactly 20 bytes (4 + 8 + 8) with no padding,
// matching the offsets the BFF uses when decoding.
#pragma pack(push, 1)
struct MarketData {
    char ticker[4];
    double price;
    uint64_t volume;
};
#pragma pack(pop)

class TcpSession : public std::enable_shared_from_this<TcpSession> {
public:
    TcpSession(tcp::socket socket) : socket_(std::move(socket)) {}

    void start() {
        do_read();
    }

private:
    void do_read() {
        auto self(shared_from_this());
        boost::asio::async_read(socket_, boost::asio::buffer(read_msg_, 4),
            [this, self](boost::system::error_code ec, std::size_t length) {
                if (!ec) {
                    // In a real system, you would look up data based on the ticker.
                    // Here, we just generate some dummy data.
                    MarketData response_data;
                    memcpy(response_data.ticker, read_msg_, 4);
                    response_data.price = 175.50 + (rand() % 100) / 100.0;
                    response_data.volume = 1000000 + (rand() % 50000);
                    
                    std::string ticker_str(read_msg_, 4);
                    std::cout << "Received request for: " << ticker_str << ". Responding..." << std::endl;

                    do_write(response_data);
                } else {
                     std::cerr << "Read error: " << ec.message() << std::endl;
                }
            });
    }

    void do_write(const MarketData& data) {
        auto self(shared_from_this());
        // Copy into a member so the buffer outlives this call; async_write may
        // complete after the caller's stack frame is gone.
        write_msg_ = data;
        boost::asio::async_write(socket_, boost::asio::buffer(&write_msg_, sizeof(MarketData)),
            [this, self](boost::system::error_code ec, std::size_t /*length*/) {
                if (!ec) {
                    // After writing, wait for the next request on the same connection.
                    do_read();
                } else {
                     std::cerr << "Write error: " << ec.message() << std::endl;
                }
            });
    }

    tcp::socket socket_;
    char read_msg_[4];
    MarketData write_msg_;
};

class TcpServer {
public:
    TcpServer(boost::asio::io_context& io_context, short port)
        : acceptor_(io_context, tcp::endpoint(tcp::v4(), port)) {
        do_accept();
    }

private:
    void do_accept() {
        acceptor_.async_accept(
            [this](boost::system::error_code ec, tcp::socket socket) {
                if (!ec) {
                    std::cout << "Accepted connection from " << socket.remote_endpoint() << std::endl;
                    std::make_shared<TcpSession>(std::move(socket))->start();
                }
                do_accept();
            });
    }

    tcp::acceptor acceptor_;
};

int main() {
    try {
        boost::asio::io_context io_context;
        TcpServer server(io_context, 9090);
        std::cout << "Legacy C++ server listening on port 9090..." << std::endl;
        io_context.run();
    } catch (std::exception& e) {
        std::cerr << "Exception: " << e.what() << "\n";
    }
    return 0;
}

Compile and run: g++ -std=c++17 legacy_server.cpp -o server -lboost_system -lpthread && ./server

2. Protobuf and gRPC Definitions

We define a gRPC service interface as the communication contract between the ASP.NET Core BFF and Envoy.

marketdata.proto

syntax = "proto3";

package marketdata;

service MarketDataService {
  rpc GetMarketData(MarketDataRequest) returns (MarketDataResponse);
}

message MarketDataRequest {
  string ticker = 1;
}

message MarketDataResponse {
  string ticker = 1;
  double price = 2;
  uint64 volume = 3;
}

3. Envoy Configuration and the Lua Protocol-Translation Script

This is the heart of the architecture. envoy.yaml exposes a gRPC-facing listener on port 8081 and uses a Lua filter on a second, TCP-level listener to translate requests into the binary stream expected by the upstream C++ service (legacy_cpp_service_cluster).

envoy.yaml

static_resources:
  listeners:
  - name: grpc_listener
    address:
      socket_address: { address: 0.0.0.0, port_value: 8081 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: grpc_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: translation_cluster }
          http_filters:
          - name: envoy.filters.http.grpc_web
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

  # A second listener hosts the TCP-level translation. The gRPC listener routes to it
  # through translation_cluster. A production setup would more likely use a single,
  # more elaborate filter chain; the split keeps this demo readable.
  - name: tcp_translator_listener
    address:
      socket_address: { address: 0.0.0.0, port_value: 10000 }
    filter_chains:
    - filters:
      # The translation hook runs ahead of tcp_proxy in the same filter chain.
      # The Lua functions below are schematic (see the note after this config);
      # in practice a WASM filter, or a gRPC-JSON transcoder plus a thin adapter,
      # would carry this logic.
      - name: envoy.filters.network.lua
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.lua.v3.Lua
          inline_code: |
            -- Called when data is received from the downstream (BFF service).
            function envoy_on_downstream_data(downstream_buffer, end_of_stream)
              -- This simplistic example assumes the request arrives as a plain 4-byte
              -- ticker string. A real gRPC-to-binary bridge needs a robust
              -- deserializer, e.g. a gRPC-JSON transcoder in front of this filter.
              local ticker = downstream_buffer:toString()
              -- Forward the raw ticker bytes to the upstream C++ service unchanged.
              downstream_buffer:drain(downstream_buffer:length())
              downstream_buffer:add(ticker)
            end

            -- Called when data is received from the upstream (C++ service).
            function envoy_on_upstream_data(upstream_buffer, end_of_stream)
              -- The C++ service sends back 20 bytes: 4 (ticker) + 8 (price) + 8 (volume).
              -- A full implementation would repackage this binary payload as a proper
              -- gRPC response (which means constructing HTTP/2 frames). Here the raw
              -- bytes are passed through and the BFF decodes them, a deliberate
              -- simplification that highlights the limits of this approach.
              if upstream_buffer:length() < 20 then
                return -- wait until the full frame has arrived
              end
            end
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: tcp_lua
          cluster: legacy_cpp_service_cluster

  clusters:
  - name: translation_cluster
    connect_timeout: 5s
    type: STATIC
    load_assignment:
      cluster_name: translation_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 127.0.0.1, port_value: 10000 }

  # The upstream cluster pointing at the legacy C++ service itself.
  - name: legacy_cpp_service_cluster
    connect_timeout: 5s
    type: STRICT_DNS   # hostname endpoint, so a DNS-based cluster type
    load_assignment:
      cluster_name: legacy_cpp_service_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: host.docker.internal, port_value: 9090 } # For Docker on Mac/Windows

Note: the Envoy configuration and Lua script above are simplified for demonstration. Extracting data from a full gRPC request and constructing a proper gRPC response at this layer is considerably more involved than shown, and the network-level Lua hook and its callbacks should be read as pseudocode for where the translation logic lives. In production you would typically combine a gRPC-JSON transcoder with a small translation filter, or use a WebAssembly (WASM) filter for better performance and stronger typing. The point here is that Envoy is capable of performing protocol translation at the network layer.
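
As a concrete pointer for the transcoder route mentioned above, the fragment below is a hedged sketch of Envoy's standard grpc_json_transcoder HTTP filter. It assumes marketdata.proto has been annotated with google.api.http options and compiled into a descriptor set; the descriptor path is an illustrative assumption.

# Sketch: goes into http_filters ahead of envoy.filters.http.router.
- name: envoy.filters.http.grpc_json_transcoder
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
    # Descriptor set generated with, e.g.:
    #   protoc --include_imports --descriptor_set_out=/etc/envoy/marketdata.pb marketdata.proto
    proto_descriptor: "/etc/envoy/marketdata.pb"
    services: ["marketdata.MarketDataService"]
    print_options:
      always_print_primitive_fields: true
    convert_grpc_status: true

With the transcoder handling the gRPC/JSON boundary, the remaining translation step only has to map a JSON-shaped request onto the 20-byte binary frame, which is a far smaller job than assembling gRPC HTTP/2 frames by hand.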

4. ASP.NET Core BFF (Backend for Frontend)

This service talks to Envoy via Grpc.Net.Client and exposes a type-safe API to the Vue.js front end through a C# tRPC implementation.

Program.cs

using Grpc.Net.Client;
using trpc.dotnet.core;
using trpc.dotnet.aspnetcore;
using System.Runtime.InteropServices;
using System.Text;

var builder = WebApplication.CreateBuilder(args);

// Setup tRPC
builder.Services.AddTRPC(new TRPCConfiguration
{
    // Configuration here if needed
});

// Register the tRPC router
builder.Services.AddSingleton<AppRouter>();

// Setup gRPC client to talk to Envoy
builder.Services.AddSingleton(services => {
    // The gRPC call will go to Envoy's listener
    var channel = GrpcChannel.ForAddress("http://localhost:8081");
    // This is a simplified way to create a client for a service that's conceptually defined in proto
    // but its actual invocation is custom. We won't use the generated client directly,
    // as the protocol translation is highly customized.
    return channel;
});

var app = builder.Build();

// Enable tRPC middleware
app.UseTRPC("/trpc", app.Services.GetRequiredService<AppRouter>());

app.Run();

// Define tRPC Router
public class AppRouter : TRPCRouter
{
    public AppRouter(GrpcChannel channel)
    {
        // For simplicity, we create a raw TCP client to talk to the Lua filter listener.
        // This bypasses gRPC framing complexity for the demo. Note that a single shared
        // NetworkStream is not safe under concurrent requests; a real BFF would pool
        // connections or serialize access.
        var tcpClient = new System.Net.Sockets.TcpClient("127.0.0.1", 10000);
        var stream = tcpClient.GetStream();

        // Define the `getMarketData` procedure
        Procedure("getMarketData",
            // Input validation (e.g., using Zod syntax if a library is used)
            TRPC.Input<string>(), 
            // The resolver function
            async (ticker, context) => {
                Console.WriteLine($"BFF received request for: {ticker}");
                
                // 1. Send request via Envoy's TCP Lua filter
                var requestBytes = Encoding.ASCII.GetBytes(ticker.PadRight(4).Substring(0, 4));
                await stream.WriteAsync(requestBytes, 0, requestBytes.Length);

                // 2. Receive the fixed 20-byte binary response forwarded by Envoy.
                // ReadAsync may return fewer bytes than requested, so loop until the frame is complete.
                var responseBuffer = new byte[20];
                var totalRead = 0;
                while (totalRead < responseBuffer.Length) {
                    var bytesRead = await stream.ReadAsync(responseBuffer, totalRead, responseBuffer.Length - totalRead);
                    if (bytesRead == 0) {
                        throw new Exception("Incomplete response from legacy service");
                    }
                    totalRead += bytesRead;
                }

                // 3. Unpack the binary data (packed layout: 4-byte ticker, 8-byte double, 8-byte uint64; little-endian host assumed)
                var responseTicker = Encoding.ASCII.GetString(responseBuffer, 0, 4);
                var price = BitConverter.ToDouble(responseBuffer, 4);
                var volume = BitConverter.ToUInt64(responseBuffer, 12);

                // 4. Return a structured object
                return new {
                    Ticker = responseTicker,
                    Price = price,
                    Volume = volume
                };
            }
        );
    }
}

5. Vue.js Front End and the tRPC Client

The front-end code shows how the tRPC client calls the backend API in a fully type-safe way.

main.ts (Setup tRPC client)

import { createTRPCProxyClient, httpBatchLink } from '@trpc/client';
import type { AppRouter } from '../../path/to/backend/AppRouter'; // Conceptual: with a C# backend, this router type would be hand-declared or code-generated rather than imported from TS source

export const trpc = createTRPCProxyClient<AppRouter>({
  links: [
    httpBatchLink({
      url: 'http://localhost:5000/trpc', // URL of your ASP.NET Core BFF
    }),
  ],
});

MarketData.vue

<template>
  <div>
    <h1>Legacy Market Data</h1>
    <div v-if="isLoading">Loading...</div>
    <div v-if="error">Error: {{ error.message }}</div>
    <div v-if="data">
      <h2>{{ data.Ticker }}</h2>
      <p>Price: ${{ data.Price.toFixed(2) }}</p>
      <p>Volume: {{ data.Volume.toLocaleString() }}</p>
    </div>
    <input v-model="ticker" @keyup.enter="refetch" placeholder="Enter ticker (e.g., GOOG)" />
    <button @click="refetch">Fetch</button>
  </div>
</template>

<script setup lang="ts">
import { ref, onMounted } from 'vue';
import { trpc } from '../trpc';

const ticker = ref('GOOG');
const data = ref<Awaited<ReturnType<typeof trpc.getMarketData.query>> | null>(null);
const isLoading = ref(false);
const error = ref<Error | null>(null);

const fetchData = async () => {
  isLoading.value = true;
  error.value = null;
  try {
    // The magic of tRPC: fully typed, feels like calling a local function.
    // `data.value.Price` will have type `number`.
    const result = await trpc.getMarketData.query(ticker.value);
    data.value = result;
  } catch (err) {
    error.value = err as Error;
  } finally {
    isLoading.value = false;
  }
};

const refetch = () => {
  fetchData();
}

onMounted(() => {
  fetchData();
});
</script>

In this component, both the call to trpc.getMarketData.query and its result are fully typed. If the shape of the backend AppRouter's return value changes, TypeScript flags it at compile time, which is exactly the core value of tRPC.

Extensibility and Limitations of the Architecture

This Envoy-based integration architecture buys us significant flexibility. If we ever decide to rewrite the C++ service in Go or Rust, the new service only has to speak the same TCP protocol; we then point the upstream cluster in the Envoy configuration at the new address, and neither the BFF nor the front-end codebase changes at all. We can also add rate limiting, authentication policies, and more in Envoy without touching any business service.
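
For illustration, such a migration would amount to a cluster-level change along these lines; the marketdata-go.internal hostname is hypothetical.

# Sketch: only the upstream cluster changes; listeners, BFF, and front end stay untouched.
- name: legacy_cpp_service_cluster
  connect_timeout: 5s
  type: STRICT_DNS
  load_assignment:
    cluster_name: legacy_cpp_service_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            # Before: socket_address: { address: host.docker.internal, port_value: 9090 }
            socket_address: { address: marketdata-go.internal, port_value: 9090 }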

That said, the approach is not without drawbacks.

  • Debugging complexity: the protocol-translation logic lives in Envoy's Lua script, which is much harder to debug than application code. Logs and metrics become the primary troubleshooting tools.
  • Performance overhead: Lua is interpreted and runs in-line on Envoy's worker threads, so it can become a bottleneck in extreme high-concurrency, low-latency scenarios. In those cases a WebAssembly (WASM) filter written in C++ or Rust is the better-performing option (see the sketch after this list).
  • Operational cost: maintaining Envoy configuration, and a service mesh in general, requires dedicated knowledge and tooling. For a small team this is a real learning curve and operational burden.
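
For reference, replacing the Lua hook with a compiled filter would look roughly like the sketch below in the tcp_translator_listener filter chain. The module path, plugin name, and root_id are assumptions, and the filter body itself (a proxy-wasm module compiled from C++ or Rust) is not shown.

# Sketch: swaps the Lua entry for a WASM network filter ahead of tcp_proxy.
- name: envoy.filters.network.wasm
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.network.wasm.v3.Wasm
    config:
      name: "marketdata_translator"              # hypothetical plugin name
      root_id: "marketdata_translator"
      vm_config:
        runtime: "envoy.wasm.runtime.v8"
        code:
          local:
            filename: "/etc/envoy/marketdata_translator.wasm"   # assumed mount path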

In the end, technology choices are always trade-offs under specific constraints. For a legacy system that must be modernized incrementally and that demands strong resilience and observability, using a service mesh as the decoupling and governance layer is a pragmatic and powerful architectural strategy.

