Defining the Problem: An Untouchable High-Performance C++ Core
We are facing a classic technical-debt scenario. The core of the system is a high-performance computation service written in C++ that speaks a custom binary protocol over TCP. It is stable and efficient, but it has become a technical island: there is no documentation, the original developers left long ago, and any change to its code carries enormous risk and regression-testing cost.
The business requirement is to build a new web front end for this service, providing real-time data display and control. The team settled on a modern stack: Vue.js on the front end, an ASP.NET Core BFF (Backend for Frontend), and tRPC as the API layer for end-to-end type safety.
The core tension: how do we let an HTTP/JSON-based web stack talk to this legacy C++ service over its custom binary TCP protocol safely and efficiently, while keeping the overall architecture observable, resilient, and extensible for the future?
Option A: Build a Dedicated Adapter Service
The first idea that comes to mind is a dedicated adapter service, also implemented in ASP.NET Core, whose core responsibility is bidirectional protocol translation.
graph TD
A[Vue.js Frontend] -- tRPC/HTTP --> B[ASP.NET Core BFF];
B -- REST/HTTP --> C[Adapter Service];
C -- Custom TCP --> D[Legacy C++ Service];
Implementation approach (a minimal sketch follows this list):
- Hard-code the parsing and framing logic for the C++ service's private TCP protocol inside the Adapter Service.
- Expose a standard RESTful or gRPC API from the Adapter Service for internal consumers.
- Have the BFF interact with the C++ service indirectly by calling the Adapter Service's API.
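To make the trade-offs concrete, here is a minimal sketch of what such an adapter could look like. Everything in it (the endpoint shape, the host name, and the 4-byte-request / 20-byte-response framing described later in this article) is an assumption for illustration, not an existing service; it assumes .NET 7+ for ReadExactlyAsync.
AdapterService/Program.cs (hypothetical)
using System.Net.Sockets;
using System.Text;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// REST facade over the legacy binary protocol. Host, port, and byte layout are
// hard-coded here, which is exactly the coupling criticized below.
app.MapGet("/market-data/{ticker}", async (string ticker) =>
{
    using var client = new TcpClient();
    await client.ConnectAsync("legacy-host", 9090);
    var stream = client.GetStream();

    // Request: exactly 4 ASCII bytes.
    var request = Encoding.ASCII.GetBytes(ticker.PadRight(4)[..4]);
    await stream.WriteAsync(request);

    // Response: 4-byte ticker + 8-byte double price + 8-byte uint64 volume.
    var buffer = new byte[20];
    await stream.ReadExactlyAsync(buffer);

    return Results.Ok(new
    {
        Ticker = Encoding.ASCII.GetString(buffer, 0, 4),
        Price = BitConverter.ToDouble(buffer, 4),
        Volume = BitConverter.ToUInt64(buffer, 12)
    });
});

app.Run();
Note that every resilience concern (retries, timeouts, connection pooling, circuit breaking) would still have to be hand-written around this endpoint, which is where the weaknesses discussed below come from.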
Advantages:
- Centralized logic: all of the ugly, non-standard protocol handling is encapsulated inside the Adapter Service, so the BFF layer stays clean.
- Fast to build: for a simple scenario, this is probably the quickest way to get the job done.
Disadvantages:
- Tight coupling: the Adapter Service is bound to the C++ service's implementation details (IP, port, wire format). If the C++ service is relocated or network policy changes, the Adapter Service must be modified and redeployed.
- No built-in resilience: service discovery, load balancing, circuit breaking, retries, and timeouts all have to be hand-coded in the Adapter Service. That is a large amount of work and easy to get wrong.
- An observability black hole: the call chain from the BFF to the C++ service breaks at this hop; latency, success rate, and other key metrics for that leg are hard to obtain.
- Blurred responsibilities: the Adapter Service handles both protocol translation and network-communication policy, violating the single-responsibility principle and becoming harder to maintain over time.
In real projects this approach tends to degenerate into yet another hard-to-maintain "glue" monolith, so we rejected it quickly.
Option B: Introduce a Service Mesh for Protocol Translation and Traffic Governance
The core idea is to pull network-communication logic out of the business applications and push it down into the infrastructure layer, where a service mesh sidecar proxy handles it. We chose Envoy Proxy.
graph TD
subgraph Frontend
A[Vue.js App]
end
subgraph BFF Server
B[ASP.NET Core BFF]
end
subgraph Legacy Server
E[Legacy C++ Service]
end
subgraph Envoy Sidecars
C[Envoy Proxy for BFF]
D[Envoy Proxy for Legacy]
end
A -- tRPC/HTTP --> B;
B -- gRPC (to localhost) --> C;
C -- mTLS --> D;
D -- Custom TCP (from localhost) --> E;
In this architecture, Envoy plays several key roles:
- Protocol translation: the BFF believes it is talking to a standard gRPC service via its local Envoy. The BFF-side Envoy receives the gRPC request and, using built-in filters (for example a Lua filter), rewrites it into the binary TCP stream the C++ service understands.
- Traffic governance: retries, timeouts, circuit breaking, and load-balancing policies are declared in Envoy configuration; no such code is written in ASP.NET Core (see the cluster sketch after this list).
- Observability: Envoy natively emits detailed metrics, logs, and traces that plug into Prometheus, ELK, Jaeger, and similar systems, giving end-to-end observability.
- Security: services can communicate over mTLS through their Envoys even if they do not support TLS themselves.
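To make "declarative governance" concrete, an upstream cluster definition along the following lines attaches circuit-breaker thresholds, outlier detection, and mTLS without touching application code. This is a sketch only: the thresholds, certificate paths, and host name are placeholders, not part of the demo configuration shown later.
clusters:
- name: legacy_cpp_service_cluster
  connect_timeout: 2s
  type: STRICT_DNS
  load_assignment:
    cluster_name: legacy_cpp_service_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: legacy-host, port_value: 9090 }
  # Circuit breaking: cap concurrent connections/requests toward the legacy service.
  circuit_breakers:
    thresholds:
    - max_connections: 64
      max_pending_requests: 128
      max_requests: 128
  # Outlier detection: temporarily eject endpoints that keep failing.
  outlier_detection:
    consecutive_5xx: 5
    interval: 10s
    base_ejection_time: 30s
  # mTLS toward the peer Envoy sidecar, even though the C++ service has no TLS support.
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
      common_tls_context:
        tls_certificates:
        - certificate_chain: { filename: /etc/envoy/certs/bff.crt }
          private_key: { filename: /etc/envoy/certs/bff.key }
        validation_context:
          trusted_ca: { filename: /etc/envoy/certs/ca.crt }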
Why we chose it:
Introducing Envoy does add deployment complexity, but it decouples network concerns from business concerns completely. The long-term maintainability, extensibility, and resilience of this architecture far exceed Option A, and it lets us modernize the infrastructure incrementally in a way the business code never notices. It is the decision that better serves the long term.
Core Implementation Overview
We will build a minimal runnable example that demonstrates the core components of Option B. Assume the C++ service offers a GetMarketData operation: it accepts a 4-byte ticker symbol (such as "GOOG") and returns a structured binary packet.
1. The Legacy C++ Service
This is a simple TCP server built with Boost.Asio. It listens on port 9090 and speaks our custom protocol.
Protocol definition:
- Request: [4-byte ticker symbol] (e.g. 'G', 'O', 'O', 'G')
- Response: [4-byte ticker symbol][8-byte price (double)][8-byte volume (uint64_t)], 20 bytes in total with no padding
legacy_server.cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>
#include <boost/asio.hpp>
using boost::asio::ip::tcp;
// Binary response layout. Packed so that sizeof(MarketData) == 20 and the wire
// format is exactly [4-byte ticker][8-byte price][8-byte volume] with no padding.
#pragma pack(push, 1)
struct MarketData {
    char ticker[4];
    double price;
    uint64_t volume;
};
#pragma pack(pop)
class TcpSession : public std::enable_shared_from_this<TcpSession> {
public:
TcpSession(tcp::socket socket) : socket_(std::move(socket)) {}
void start() {
do_read();
}
private:
void do_read() {
auto self(shared_from_this());
boost::asio::async_read(socket_, boost::asio::buffer(read_msg_, 4),
[this, self](boost::system::error_code ec, std::size_t length) {
if (!ec) {
// In a real system, you would look up data based on the ticker.
// Here, we just generate some dummy data.
MarketData response_data;
memcpy(response_data.ticker, read_msg_, 4);
response_data.price = 175.50 + (rand() % 100) / 100.0;
response_data.volume = 1000000 + (rand() % 50000);
std::string ticker_str(read_msg_, 4);
std::cout << "Received request for: " << ticker_str << ". Responding..." << std::endl;
do_write(response_data);
} else {
std::cerr << "Read error: " << ec.message() << std::endl;
}
});
}
void do_write(const MarketData& data) {
auto self(shared_from_this());
boost::asio::async_write(socket_, boost::asio::buffer(&data, sizeof(MarketData)),
[this, self](boost::system::error_code ec, std::size_t /*length*/) {
if (!ec) {
// After writing, wait for the next request on the same connection.
do_read();
} else {
std::cerr << "Write error: " << ec.message() << std::endl;
}
});
}
tcp::socket socket_;
char read_msg_[4];
};
class TcpServer {
public:
TcpServer(boost::asio::io_context& io_context, short port)
: acceptor_(io_context, tcp::endpoint(tcp::v4(), port)) {
do_accept();
}
private:
void do_accept() {
acceptor_.async_accept(
[this](boost::system::error_code ec, tcp::socket socket) {
if (!ec) {
std::cout << "Accepted connection from " << socket.remote_endpoint() << std::endl;
std::make_shared<TcpSession>(std::move(socket))->start();
}
do_accept();
});
}
tcp::acceptor acceptor_;
};
int main() {
try {
boost::asio::io_context io_context;
TcpServer server(io_context, 9090);
std::cout << "Legacy C++ server listening on port 9090..." << std::endl;
io_context.run();
} catch (std::exception& e) {
std::cerr << "Exception: " << e.what() << "\n";
}
return 0;
}
Build and run: g++ legacy_server.cpp -o server -lboost_system -lpthread && ./server
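Before wiring Envoy in, it helps to smoke-test the raw protocol directly. The following throwaway client is a sketch (it assumes .NET 7+ for ReadExactlyAsync and that the server runs locally); it sends a 4-byte ticker and decodes the 20-byte response:
smoke_test/Program.cs (hypothetical)
using System.Net.Sockets;
using System.Text;

using var client = new TcpClient("127.0.0.1", 9090);
var stream = client.GetStream();

// Request: exactly 4 ASCII bytes.
await stream.WriteAsync(Encoding.ASCII.GetBytes("GOOG"));

// Response: 20 bytes = 4 (ticker) + 8 (double price) + 8 (uint64 volume).
var buffer = new byte[20];
await stream.ReadExactlyAsync(buffer);

Console.WriteLine($"Ticker: {Encoding.ASCII.GetString(buffer, 0, 4)}");
Console.WriteLine($"Price:  {BitConverter.ToDouble(buffer, 4)}");
Console.WriteLine($"Volume: {BitConverter.ToUInt64(buffer, 12)}");
If the three fields print sensibly, the framing assumptions used by the Lua filter and the BFF below hold.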
2. Protobuf and gRPC Definitions
We define a gRPC service interface; it is the communication contract between the ASP.NET Core BFF and Envoy.
marketdata.proto
syntax = "proto3";
package marketdata;
service MarketDataService {
rpc GetMarketData(MarketDataRequest) returns (MarketDataResponse);
}
message MarketDataRequest {
string ticker = 1;
}
message MarketDataResponse {
string ticker = 1;
double price = 2;
uint64 volume = 3;
}
3. Envoy Configuration and the Lua Translation Script
This is the heart of the architecture. envoy.yaml exposes a gRPC-facing listener on port 8081 and a raw TCP listener on port 10000; a Lua filter on the TCP listener rewrites traffic into the binary stream expected by the upstream C++ service (legacy_cpp_service_cluster).
envoy.yaml
static_resources:
  listeners:
  # Listener 1: the gRPC-facing entry point that the BFF talks to.
  - name: grpc_listener
    address:
      socket_address: { address: 0.0.0.0, port_value: 8081 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: grpc_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: translation_cluster }
          http_filters:
          - name: envoy.filters.http.grpc_web
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  # Listener 2: a raw TCP listener that performs the protocol translation.
  # The gRPC listener routes here. A production setup would more likely use a
  # single, richer filter chain; a separate listener keeps the demo simple.
  - name: tcp_translator_listener
    address:
      socket_address: { address: 0.0.0.0, port_value: 10000 }
    filter_chains:
    - filters:
      # Here we inject our Lua filter for transformation. It runs ahead of
      # tcp_proxy and rewrites the byte stream in both directions
      # (illustrative only; see the note after this file).
      - name: envoy.filters.network.lua
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.lua.v3.Lua
          inline_code: |
            -- Called when data is received from the downstream (BFF service).
            function envoy_on_downstream_data(downstream_buffer, end_of_stream)
              -- This simplistic example assumes the request arrives as a plain 4-byte string.
              -- A real gRPC-to-binary bridge needs a more robust deserializer, e.g. a
              -- gRPC-JSON transcoder in front of a much smaller transformation step.
              local ticker = downstream_buffer:toString()
              -- Rewrite the buffer so only the raw ticker bytes continue to the upstream
              -- C++ service.
              downstream_buffer:drain(downstream_buffer:length())
              downstream_buffer:add(ticker)
            end
            -- Called when data is received from the upstream (C++ service).
            function envoy_on_upstream_data(upstream_buffer, end_of_stream)
              -- The C++ service sends back 20 bytes: 4 (ticker) + 8 (price) + 8 (volume).
              if upstream_buffer:length() >= 20 then
                -- In a real scenario you would convert this binary data back into a Protobuf
                -- message and proper gRPC frames. Here the raw bytes are simply passed
                -- through to the BFF, which has to decode them itself. This is a
                -- simplification and highlights the limitation of this minimal approach.
              end
            end
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: tcp_lua
          cluster: legacy_cpp_service_cluster
  clusters:
  # Routes traffic from the gRPC listener to the local TCP translator listener.
  - name: translation_cluster
    connect_timeout: 5s
    type: STATIC
    load_assignment:
      cluster_name: translation_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 127.0.0.1, port_value: 10000 }
  # The legacy C++ service itself. STRICT_DNS is used because
  # host.docker.internal is a host name, not an IP address.
  - name: legacy_cpp_service_cluster
    connect_timeout: 5s
    type: STRICT_DNS
    load_assignment:
      cluster_name: legacy_cpp_service_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: host.docker.internal, port_value: 9090 } # For Docker on Mac/Windows
Note: the Envoy configuration and Lua script above are simplified for demonstration purposes; read the network-level Lua filter as pseudocode for whichever transformation mechanism you choose. In production, extracting data from a complete gRPC request and constructing a valid gRPC response is considerably more involved. A typical setup combines the gRPC-JSON transcoder with a small transformation filter, or uses a WebAssembly (WASM) filter for better performance and stronger typing. The point here is simply that Envoy is capable of performing protocol translation at the network layer.
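The transcoder mentioned above is an off-the-shelf Envoy HTTP filter. As a sketch of how it would slot into the HTTP filter chain (the descriptor path is a placeholder, and the .proto would additionally need google.api.http annotations to define the JSON routes):
http_filters:
- name: envoy.filters.http.grpc_json_transcoder
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
    # Generated with: protoc --include_imports --descriptor_set_out=marketdata.pb marketdata.proto
    proto_descriptor: "/etc/envoy/marketdata.pb"
    services: ["marketdata.MarketDataService"]
    print_options:
      always_print_primitive_fields: true
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router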
4. The ASP.NET Core BFF (Backend for Frontend)
This service uses Grpc.Net.Client to communicate with Envoy and exposes a type-safe API to the Vue.js front end through a tRPC-compatible C# layer (tRPC has no official .NET implementation, so the TRPCRouter API below should be read as an illustrative community/handwritten shim).
Program.cs
using Grpc.Net.Client;
using trpc.dotnet.core;
using trpc.dotnet.aspnetcore;
using System.Runtime.InteropServices;
using System.Text;
var builder = WebApplication.CreateBuilder(args);
// Setup tRPC
builder.Services.AddTRPC(new TRPCConfiguration
{
// Configuration here if needed
});
// Register the tRPC router
builder.Services.AddSingleton<AppRouter>();
// Setup gRPC client to talk to Envoy
builder.Services.AddSingleton(services => {
// The gRPC call will go to Envoy's listener
var channel = GrpcChannel.ForAddress("http://localhost:8081");
// This is a simplified way to create a client for a service that's conceptually defined in proto
// but its actual invocation is custom. We won't use the generated client directly,
// as the protocol translation is highly customized.
return channel;
});
var app = builder.Build();
// Enable tRPC middleware
app.UseTRPC("/trpc", app.Services.GetRequiredService<AppRouter>());
app.Run();
// Define tRPC Router
public class AppRouter : TRPCRouter
{
public AppRouter(GrpcChannel channel)
{
        // For simplicity, we create a raw TCP client to talk to the Lua filter listener.
        // This bypasses gRPC framing complexity for the demo. Note that a single shared
        // connection is not safe under concurrent requests; a real BFF would pool connections.
        var tcpClient = new System.Net.Sockets.TcpClient("127.0.0.1", 10000);
var stream = tcpClient.GetStream();
// Define the `getMarketData` procedure
Procedure("getMarketData",
// Input validation (e.g., using Zod syntax if a library is used)
TRPC.Input<string>(),
// The resolver function
async (ticker, context) => {
Console.WriteLine($"BFF received request for: {ticker}");
// 1. Send request via Envoy's TCP Lua filter
var requestBytes = Encoding.ASCII.GetBytes(ticker.PadRight(4).Substring(0, 4));
await stream.WriteAsync(requestBytes, 0, requestBytes.Length);
                // 2. Receive the 20-byte binary response relayed by Envoy.
                //    TCP is a byte stream, so keep reading until all 20 bytes have arrived.
                var responseBuffer = new byte[20];
                var offset = 0;
                while (offset < responseBuffer.Length) {
                    var bytesRead = await stream.ReadAsync(responseBuffer, offset, responseBuffer.Length - offset);
                    if (bytesRead == 0) {
                        throw new Exception("Connection closed before a complete response was received");
                    }
                    offset += bytesRead;
                }
                // 3. Unpack the binary data. BitConverter assumes host byte order
                //    (little-endian on x86/x64), matching what the C++ service writes
                //    when both ends run on the same architecture.
var responseTicker = Encoding.ASCII.GetString(responseBuffer, 0, 4);
var price = BitConverter.ToDouble(responseBuffer, 4);
var volume = BitConverter.ToUInt64(responseBuffer, 12);
// 4. Return a structured object
return new {
Ticker = responseTicker,
Price = price,
Volume = volume
};
}
);
}
}
5. Vue.js Front End and the tRPC Client
The front-end code shows how the tRPC client calls the backend API in a fully type-safe way.
main.ts (Setup tRPC client)
import { createTRPCProxyClient, httpBatchLink } from '@trpc/client';
import type { AppRouter } from '../../path/to/backend/AppRouter'; // This import is conceptual for type safety
export const trpc = createTRPCProxyClient<AppRouter>({
links: [
httpBatchLink({
url: 'http://localhost:5000/trpc', // URL of your ASP.NET Core BFF
}),
],
});
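Because the backend is C# rather than TypeScript, there is no real AppRouter type to import; the type contract has to be maintained by hand. One pragmatic option, sketched below under the assumption that the C# layer follows tRPC's JSON wire format, is a type-only mirror router that is never executed and exists purely so the conceptual import above resolves to something:
trpc-contract.ts (hand-maintained, hypothetical)
import { initTRPC } from '@trpc/server';
import { z } from 'zod';

const t = initTRPC.create();

// This router is never executed. It only mirrors the C# AppRouter so that
// `typeof appRouter` can be exported and consumed by createTRPCProxyClient.
const appRouter = t.router({
  getMarketData: t.procedure
    .input(z.string().length(4))
    .query((): { Ticker: string; Price: number; Volume: number } => {
      throw new Error('type-only stub; the real resolver lives in the ASP.NET Core BFF');
    }),
});

export type AppRouter = typeof appRouter;
If the C# resolver's return shape changes, this file has to be updated by hand; that manual step is the price of crossing the language boundary and a good candidate for code generation later.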
MarketData.vue
<template>
<div>
<h1>Legacy Market Data</h1>
<div v-if="isLoading">Loading...</div>
<div v-if="error">Error: {{ error.message }}</div>
<div v-if="data">
<h2>{{ data.Ticker }}</h2>
<p>Price: ${{ data.Price.toFixed(2) }}</p>
<p>Volume: {{ data.Volume.toLocaleString() }}</p>
</div>
<input v-model="ticker" @keyup.enter="refetch" placeholder="Enter ticker (e.g., GOOG)" />
<button @click="refetch">Fetch</button>
</div>
</template>
<script setup lang="ts">
import { ref, onMounted } from 'vue';
import { trpc } from '../trpc';
const ticker = ref('GOOG');
const data = ref<Awaited<ReturnType<typeof trpc.getMarketData.query>> | null>(null);
const isLoading = ref(false);
const error = ref<Error | null>(null);
const fetchData = async () => {
isLoading.value = true;
error.value = null;
try {
// The magic of tRPC: fully typed, feels like calling a local function.
// `data.value.Price` will have type `number`.
const result = await trpc.getMarketData.query(ticker.value);
data.value = result;
} catch (err) {
error.value = err as Error;
} finally {
isLoading.value = false;
}
};
const refetch = () => {
fetchData();
}
onMounted(() => {
fetchData();
});
</script>
In this component, the call to trpc.getMarketData.query and its result are fully type-safe. If the shape of the backend AppRouter's return value changes, TypeScript flags the mismatch at compile time; that is exactly the core value of tRPC.
Extensibility and Limitations of the Architecture
This Envoy-based heterogeneous integration architecture gives us significant leverage. If we later decide to rewrite the C++ service in Go or Rust, the new service only has to implement the same TCP protocol and we update the upstream cluster address in the Envoy configuration; the BFF and front-end codebases do not change at all. Likewise, rate limiting, authentication policies, and similar concerns can be added in Envoy without touching any business service.
The approach is not without drawbacks, however.
- Debugging complexity: the translation logic lives in an Envoy Lua script, which is much harder to debug than application code. Logs and metrics become the primary tools for diagnosing problems.
- Performance overhead: the Lua interpreter adds per-request cost, and under extreme high-concurrency, low-latency workloads it can become a bottleneck. In that case a WebAssembly (WASM) filter written in C++ or Rust is the higher-performance choice.
- Operational cost: maintaining Envoy configuration and a service mesh requires dedicated knowledge and tooling; for a small team, that is a non-trivial learning curve and operational burden.
In the end, technology choices are always trade-offs under specific constraints. For a legacy system that must be modernized incrementally, with high demands on resilience and observability, using a service mesh as the decoupling and governance layer is a pragmatic and powerful architectural strategy.