What a healthy decision log looks like (and three red flags).

Most crypto trading bots give you a P&L screenshot. Some give you a trade list. A few will show you the rule that fired on the last trade. Almost none will show you the rule that didn't fire on the last hundred — and that absence is the difference between a transparent trading bot you can audit and a black box you can only believe.

A decision log is the full chain: every read the trading agent did, every rule it checked, every decision it took, every order it sent, every fill it got back. Stored in order, stored with the inputs the agent actually saw, replayable at any point in the timeline. It's the trace that turns a non-custodial trading agent into something you can hold accountable rather than something you have to trust.

We've spent a lot of editorial oxygen on Engine arguing that logs matter. This post is about something narrower: how to tell a real log from a polished one. There are four features a healthy decision log makes legible, three patterns that should make you close the tab on whatever you're looking at, and three questions you can ask any vendor that will surface most of the difference in about ninety seconds.

What a healthy log actually shows.

A healthy decision log makes four things legible at the event level, not at the aggregate.

First, every read. Every funding rate the agent fetched, every open-interest (OI) delta it tabulated, every spot tick it used to mark a position. Not a summary like "the agent considered current market conditions," but the literal numbers, with timestamps. A live Engine strategy generates around 4,000 to 8,000 events per 24 hours on a quiet market, and more during volatility. The vast majority are reads. That number is supposed to be large. A bot whose log has only thirty entries in a day is not running a strategy; it is running a screensaver.

Second, every rule check. When the strategy file says funding_rate < -0.012% over 2 consecutive epochs, a healthy log shows the check happening, with the rule name, the value that was read, and the boolean outcome. Not "matched signal R3" with no further detail — the literal predicate and the literal value the predicate saw. If you can't tie a decision back to a line in the strategy file you wrote, the strategy file isn't doing any work.
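As a rough illustration, a rule check could be recorded along the lines of the sketch below. It assumes Python, a hypothetical RuleCheck record, and a rule labeled R3 purely for the example; none of these names come from Engine's actual code. The only point is that the predicate text, the values it saw, and the boolean outcome are stored together.

```python
from dataclasses import dataclass

# Hypothetical constants mirroring the example predicate in the text.
THRESHOLD_PCT = -0.012   # funding-rate threshold, in percent
EPOCHS = 2               # consecutive epochs that must satisfy the predicate


@dataclass(frozen=True)
class RuleCheck:
    rule: str        # rule name as written in the strategy file
    predicate: str   # the literal predicate text
    values: tuple    # the literal values the predicate saw
    outcome: bool    # True if the rule fired


def check_funding_rule(funding_history_pct, rule_name="R3"):
    """Evaluate the predicate on the most recent epochs and record the result."""
    window = tuple(funding_history_pct[-EPOCHS:])
    outcome = len(window) == EPOCHS and all(v < THRESHOLD_PCT for v in window)
    return RuleCheck(
        rule=rule_name,
        predicate=f"funding_rate < {THRESHOLD_PCT}% over {EPOCHS} consecutive epochs",
        values=window,
        outcome=outcome,
    )


fired = check_funding_rule([-0.010, -0.015, -0.013])
skipped = check_funding_rule([-0.015, -0.011])
print(fired)
print(skipped)
```

Both records are worth keeping. The second one, where the rule did not fire, is the kind of line a polished log tends to drop.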

Third, every decision, including the do-nothing ones. This is the feature that distinguishes a real log from a marketing log most cleanly. The agent scanned the universe. It found nothing. It said so. The line in the log reads "no qualifying signal, skipping" or "regime gate vetoed: realized vol 87% > 80%." The do-nothing line is the one that proves the discipline of the strategy. We'll come back to this.

Fourth, every order and fill, with the rounding errors visible. Split into child orders if the size demanded it, with IOC or post-only routing as written, with the actual realized slippage in basis points. Engine emits a separate event per child order so the execution shape is visible — if the agent decided 2.4% NAV and the fill came back as 2.39% because the last child got partially filled, the log shows both numbers. Glossing 2.39% as 2.4% in the dashboard is exactly the kind of small lie that compounds into a big one.
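To make the "both numbers" idea concrete, here is a minimal Python sketch of an execution summary that keeps the requested size and the realized size as separate fields. The function names, the field names, and the sample fills are all assumptions made for illustration, not Engine's schema.

```python
def slippage_bps(reference_price, fill_price, side):
    """Signed slippage in basis points; positive means the fill was worse."""
    direction = 1 if side == "buy" else -1
    return direction * (fill_price - reference_price) / reference_price * 10_000


def execution_record(target_nav_pct, nav_usd, reference_price, side, child_fills):
    """Summarize child fills without overwriting the size that was requested."""
    filled_qty = sum(qty for qty, _ in child_fills)
    filled_notional = sum(qty * price for qty, price in child_fills)
    if filled_qty == 0:
        # Nothing filled: report that explicitly instead of inventing a price.
        return {"target_nav_pct": target_nav_pct, "filled_nav_pct": 0.0,
                "vwap": None, "slippage_bps": None}
    vwap = filled_notional / filled_qty
    return {
        "target_nav_pct": target_nav_pct,                    # what was decided
        "filled_nav_pct": filled_notional / nav_usd * 100,   # what actually happened
        "vwap": vwap,
        "slippage_bps": slippage_bps(reference_price, vwap, side),
    }


# Hypothetical fills: the last child order is only partially filled.
record = execution_record(
    target_nav_pct=2.4,
    nav_usd=100_000,
    reference_price=1_000.0,
    side="buy",
    child_fills=[(0.8, 1_000.0), (0.8, 1_001.0), (0.79, 1_002.0)],
)
print(record)
```

The design choice that matters is that target_nav_pct is never rounded toward filled_nav_pct or the other way around; a dashboard can format either number, but the record keeps both.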

The way to test all four at once: ask the agent why it took (or didn't take) a specific position at a specific timestamp, and check whether the answer cites the rule and the values it read. If the answer is "because the market signals were bullish," you are not reading a decision log. You are reading a press release.
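One way to picture that test is a "why" query over an ordered event list. The sketch below is an assumption-laden illustration in Python: the Event type, the kind names, and the sample values (borrowed from the regime-gate example above) are hypothetical, and a real trace would carry far more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: str      # timestamp, as recorded at the time of the event
    kind: str    # "read", "rule", or "decision" in this sketch
    data: dict   # the literal values involved


def why(events, decision_ts):
    """Return the reads and rule checks that led to the decision at decision_ts.

    Events are assumed to be stored in the order they happened.
    """
    cited = []
    for ev in events:
        if ev.kind == "decision":
            if ev.ts == decision_ts:
                return cited + [ev]
            cited = []  # earlier inputs belong to an earlier decision
        elif ev.kind in ("read", "rule"):
            cited.append(ev)
    return []  # no decision recorded at that timestamp


events = [
    Event("09:00:00", "read", {"realized_vol_pct": 87}),
    Event("09:00:00", "rule", {"rule": "regime_gate", "predicate": "realized_vol <= 80%",
                               "value": 87, "outcome": False}),
    Event("09:00:01", "decision", {"action": "skip", "reason": "regime gate vetoed"}),
    Event("09:05:00", "read", {"realized_vol_pct": 74}),
    Event("09:05:01", "rule", {"rule": "regime_gate", "predicate": "realized_vol <= 80%",
                               "value": 74, "outcome": True}),
    Event("09:05:02", "decision", {"action": "scan_signals"}),
]

for ev in why(events, "09:00:01"):
    print(ev.ts, ev.kind, ev.data)
```

An answer built this way can only cite what was written down at decision time, which is the property the test is probing for.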

Red flag #1: aggregate explanations dressed up as a log.

The first failure mode is the most common, because it photographs well.

You open the bot's history page and you see:

2026-06-22 09:14 · LONG ETH-PERP 0.42 ETH @ $3,847
  "Entered long on bullish funding signal and momentum confirmation."

That sentence isn't a log line. It's a synthesis. There is no way to know, from that sentence alone, whether the agent (a) ran a deterministic rule, evaluated a real funding-rate value, sized the position by a written sizing rule, and reported the result, or (b) hit a "send order" button somewhere on a different surface and asked a language model afterwards to write a plausible-sounding reason.

The shape of a real log line is closer to:

09:14:18 · agent.scan   ETH-PERP funding -0.012% · matches R3
09:14:22 · agent.decide size 2.4% NAV · conviction 0.78
09:14:23 · agent.decide LONG ETH-PERP · stop $3,720
09:14:24 · agent.trade  split 3 child orders · IOC 12bps
09:14:25 · agent.trade  fill 0.420 ETH @ $3,847

Five events, each with a timestamp, each with a kind, each with the literal value the agent had in hand. The "explanation" is reconstructible from the trace — you can ask the agent to summarize later, but the trace is the source of truth, not whatever the model writes for the dashboard. If the explanation is the only artifact, the explanation is the product. That is not a transparent agent. That is a copywriter.

A useful heuristic: scroll a vendor's log page and count the timestamps. If timestamps appear once per trade, you're looking at a list of trades with captions. If they appear several times per trade, with sub-second precision, on events that aren't trades at all, you're looking at something closer to a real trace. The latter is harder to fake. That's most of the point.
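The timestamp-counting heuristic can be roughed out in a few lines of Python. The parser below assumes the line layout of the five-event example above (timestamp, a middle dot, an event kind, then detail) and treats an agent.trade line whose detail starts with "fill" as a completed trade. That convention is an assumption for this sketch; a real vendor's format will differ.

```python
SAMPLE_TRACE = [
    "09:14:18 · agent.scan   ETH-PERP funding -0.012% · matches R3",
    "09:14:22 · agent.decide size 2.4% NAV · conviction 0.78",
    "09:14:23 · agent.decide LONG ETH-PERP · stop $3,720",
    "09:14:24 · agent.trade  split 3 child orders · IOC 12bps",
    "09:14:25 · agent.trade  fill 0.420 ETH @ $3,847",
]


def parse_line(line):
    """Split a log line into (timestamp, kind, detail)."""
    timestamp, rest = line.split("·", 1)
    kind, _, detail = rest.strip().partition(" ")
    return timestamp.strip(), kind, detail.strip()


def events_per_fill(lines):
    """Average number of timestamped events per filled trade, or None if no fills."""
    events = [parse_line(line) for line in lines]
    fills = sum(
        1 for _, kind, detail in events
        if kind == "agent.trade" and detail.startswith("fill")
    )
    return len(events) / fills if fills else None


print(events_per_fill(SAMPLE_TRACE))
```

A value near 1 suggests a list of trades with captions; a value well above 1, with many non-trade events in between, looks more like a trace.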

Red flag #2: only the winners get logged.

The healthiest log we publish on a live Engine strategy is dominated by do-nothing entries. On a typical day, roughly nine out of ten scans end with "no qualifying signal, skipping." The line costs the agent nothing to emit and the user nothing to skim. It also costs the bot vendor a lot to fake, which is why most of them don't bother.

A bot that only renders filled trades is hiding the most informative line in the trace: the line where the rule didn't fire. That line is what proves the rule was actually evaluated, on data the agent had access to, with the outcome the agent reported. If you can't see it, the strategy is statistically indistinguishable from a strategy that just trades when the operator feels like it.

Watch for the related pattern: bots that show you the last trade and the previous trade with no concept of intervening time. A 23-hour gap between two log entries is not silence. It is either 23 hours of scans not surfaced, or it is the bot doing nothing for 23 hours, and you should be able to tell which. Engine shows you the count of scans in the gap and lets you replay any of them; if the count is zero, the gap is a bug, and that bug is visible too.
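Counting the scans inside a gap is cheap if scan timestamps are kept sorted. The following Python sketch shows one way to do it; the function names and the 15-minute scan schedule are hypothetical and chosen only to make the example run.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def scans_in_gap(scan_times, gap_start, gap_end):
    """Count scans strictly between two log entries; scan_times must be sorted."""
    return bisect_left(scan_times, gap_end) - bisect_right(scan_times, gap_start)


def describe_gap(scan_times, gap_start, gap_end):
    """Say whether a gap is quiet discipline or missing data."""
    count = scans_in_gap(scan_times, gap_start, gap_end)
    if count == 0:
        return "no scans recorded in the gap: treat it as a bug"
    return f"{count} scans recorded in the gap"


first_trade = datetime(2026, 6, 22, 9, 14)
next_trade = first_trade + timedelta(hours=23)

# Hypothetical schedule: one scan every 15 minutes between the two trades.
scans = [first_trade + timedelta(minutes=15 * i) for i in range(1, 92)]

print(describe_gap(scans, first_trade, next_trade))
print(describe_gap([], first_trade, next_trade))
```

Either answer is informative. What should not be possible is a gap for which no count can be produced at all.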

A useful sanity check before you trust a bot: pull up its last seven days of public logs and compute the ratio of "scanned and skipped" lines to "took an action" lines. On a strategy with a real regime gate and a real signal predicate, you should see at least four "skipped" lines per "acted" line, usually closer to ten. If there is only one "skipped" line per "acted" line, or fewer, the strategy is over-fitting in real time and the log is hiding it. If there are no "skipped" lines at all, the strategy doesn't have a discipline. It has a habit.
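The ratio check is simple enough to script. In this Python sketch the outcome labels "skipped" and "acted" and the verdict strings are assumptions; the thresholds of 4 and 1 come from the paragraph above, and the band between them is labeled as falling short of the floor because the text does not name it.

```python
def assess_skip_ratio(outcomes):
    """Return (skipped_per_acted, verdict) for a list of per-scan outcomes."""
    skipped = outcomes.count("skipped")
    acted = outcomes.count("acted")
    if skipped == 0:
        return None, "no skipped lines at all"
    if acted == 0:
        return None, "only skips in this window"
    ratio = skipped / acted
    if ratio <= 1:
        return ratio, "red flag: one to one or worse"
    if ratio < 4:
        return ratio, "below the 4:1 floor"
    return ratio, "consistent with a gated strategy"


week = ["skipped"] * 9 + ["acted"]
print(assess_skip_ratio(week))
print(assess_skip_ratio(["skipped", "acted"]))
print(assess_skip_ratio(["acted"] * 3))
```

A passing ratio is not proof of a sound strategy; it only shows that the log records the scans where nothing happened.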

Red flag #3: explanations that drift between visits.

The third red flag takes thirty seconds to check, and it immediately catches the bots you might otherwise have shrugged at and let through.

Open a trade from last week. Note the reason the dashboard gives. Refresh the page. Read it again. If the second explanation is substantively different from the first — same trade, same timestamp, different rationale — the explanation is being generated post-hoc by a language model that looks at the trade record and writes a story. That model is not part of the strategy. It is part of the marketing surface.

A real decision trace is written at decision time, by the same code that read the inputs and applied the rule. It does not change between page loads. You can replay it tomorrow, next month, in a different browser, after the vendor has shipped a new model version, and the explanation is the same sequence of events with the same values. If the reasoning is a function of when you ask, the bot is not reasoning. It is improvising a defense of an action that already happened.

There's a subtler version of this failure that's worth flagging because it slips past most users: dashboards that "summarize" a trade with a freshly generated paragraph each time you click in, while the underlying event list is fine. Two things to check. One, does the timestamp of the explanation match the timestamp of the decision, or is it the timestamp of your visit? Two, can you export the explanation as part of the trade record, or only as on-screen text? An honest explanation is a stored field, not a re-rendered one.
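If a vendor lets you export trade records, both checks can be automated. The Python sketch below assumes a hypothetical export format with explanation, decided_at, and explained_at fields; real exports will use other names, so treat this as a shape, not an API.

```python
import hashlib


def fingerprint(text):
    """Stable fingerprint of an explanation, for comparing two exports."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def looks_stored(first_export, second_export):
    """True if the explanation is identical across exports and stamped at decision time."""
    same_text = (
        fingerprint(first_export["explanation"])
        == fingerprint(second_export["explanation"])
    )
    stamped_at_decision = first_export["explained_at"] == first_export["decided_at"]
    return same_text and stamped_at_decision


# Hypothetical exports of the same trade, fetched on two different visits.
visit_one = {
    "decided_at": "2026-06-22T09:14:23Z",
    "explained_at": "2026-06-22T09:14:23Z",
    "explanation": "R3 fired; size 2.4% NAV; stop $3,720",
}
visit_two = dict(visit_one)
regenerated = dict(
    visit_one,
    explained_at="2026-06-29T18:02:11Z",
    explanation="Entered long on bullish funding signal and momentum confirmation.",
)

print(looks_stored(visit_one, visit_two))
print(looks_stored(visit_one, regenerated))
```

Comparing fingerprints instead of eyeballing the text catches small rewordings that a quick reread would miss.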

Engine's decision log is immutable in the same sense your trade history on the venue is immutable: the trace is written when the decision is taken, signed alongside the order, and indexable by timestamp forever after. You can ask the agent "why did you go long here?" in February and get the same answer in August. That property is not glamorous. It is, however, the property that distinguishes a transparent crypto trading agent from one that is good at sounding transparent.
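One common way to make a log tamper-evident is a hash chain, where each entry commits to the one before it. The sketch below illustrates that general technique in Python; it is not a description of how Engine signs or stores its trace, and every name in it is hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash, event):
    """Hash an event together with the hash of the entry before it."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log, event):
    """Append an event, chaining it to the current tail of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(log):
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"ts": "09:14:18", "kind": "agent.scan", "funding_pct": -0.012})
append(log, {"ts": "09:14:23", "kind": "agent.decide", "action": "LONG ETH-PERP"})
print(verify(log))

log[0]["event"]["funding_pct"] = -0.020  # rewrite history after the fact
print(verify(log))
```

A chain like this does not prove the original values were honest; it only makes later edits detectable, which is the narrower property the paragraph above is about.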

How to test any product in ninety seconds.

There are three asks you can put to any non-custodial trading bot or crypto trading agent that markets itself on auditability. They cover most of the surface above without requiring you to read a paper.

Show me a no-op log line. If they can't, the strategy is probably not running a rule. It is running a feeling.
Show me a decision line that cites a rule name and a value the agent read at decision time. If they can only show you a sentence, the sentence is the product.
Replay an old decision. If the second telling drifts from the first, the telling is being generated, not retrieved.

If a vendor fumbles on any of the three, the conversation isn't worth continuing. You're not asking for anything exotic. You're asking for the four data structures every honest trading agent has to keep anyway: a read trace, a rule evaluation, a decision record, and an execution record. You are asking for what their own code wrote down, in the order it wrote it. The product can be opinionated about how to present those four things. The product cannot be cute about whether to keep them.

The boring conclusion: the test of a trading bot isn't whether it makes money in the screenshot. It's whether you can audit it after it doesn't. The decision log is what makes the audit possible. A bot that won't show you one is telling you what kind of bot it is.