Forums  > Software  > What stack are you using for Logs/Monitor/Alerts  


Total Posts: 285
Joined: Jan 2015
Posted: 2018-01-08 11:53
I'm curious what other NP'ers who are running automated trading systems are using for logging, monitoring, and alerts. I'm poking my nose into this topic since I want to upgrade my current setup to something shinier. I haven't really put much effort into this side of things: up until now, I pretty much get by dumping output to stdout, piping to log files, then regularly checking things with grep/sed/awk after shelling into the production machine.

However, I have a baby at home and do a lot of trading in a different timezone, so I'd like to make it easier to step away, plus offload some of the responsibility to a non-technical person on my team. It'd be interesting to hear what solutions other people are using in this area, particularly any good open source or relatively cheap software that can just be plugged in and turned on. It's hard to do research here, since everything's so web-dev focused. Off the top of my head, here's a rough outline of what I'm looking at (critique or suggestions definitely welcome):

- Log in application to syslog (instead of stdout)
- Logstash for syncing logs from prod to archive
- Nagios to let me know if the server blows up or quoter dies
- Logstash/Splunk to pub/sub trading events from the quoter output
- Pagerduty to blow up my phone in case shit hits the fan
- Some sort of web frontend for easy monitoring: refresh PnL, positions, trades, other strat-specific stats.
- Bonus points if that frontend could also plot intraday PnL, etc. Unfortunately I can't really find any project that does this out of the box. It would be nice if Graphite or Kibana could be easily shoehorned into doing this...
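For the first bullet, here's a minimal sketch of what I have in mind (Python, assuming a local syslog daemon listening on UDP 514; the name `make_trading_logger` is just mine):

```python
import logging
import logging.handlers

def make_trading_logger(name: str = "quoter") -> logging.Logger:
    """Send application logs to the local syslog daemon over UDP
    instead of stdout, so syslog (and whatever tails it) picks them up."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=("localhost", 514))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger

log = make_trading_logger()
log.info("quoter started")
```

From there the syslog daemon (or Logstash reading syslog) takes over shipping.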

Good questions outrank easy answers. -Paul Samuelson


Total Posts: 1024
Joined: Nov 2004
Posted: 2018-01-08 12:47
I would be interested to see proposals on that topic too actually!

As a side note, and this will be my very modest contribution: my experience over the years has taught me that the worst always happens silently... somewhere deep inside the systems, staying undiscovered and unlogged until sh... hits the fan. As a result I have always been a supporter of jobs systematically and politely saying: "I am done now".
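A tiny Python sketch of that habit -- a decorator (names are mine) so every job announces start, success, or failure:

```python
import functools
import logging

log = logging.getLogger("jobs")

def announce(job):
    """Make a job say 'I am done now' -- or shout loudly when it fails --
    so silent deaths don't go unnoticed."""
    @functools.wraps(job)
    def wrapper(*args, **kwargs):
        log.info("%s: starting", job.__name__)
        try:
            result = job(*args, **kwargs)
        except Exception:
            log.exception("%s: FAILED", job.__name__)
            raise
        log.info("%s: I am done now", job.__name__)
        return result
    return wrapper
```

A supervisor only has to check for the missing "I am done now", not reverse-engineer what silence means.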

If you are not living on the edge you are taking up too much space.


Total Posts: 1003
Joined: Jun 2007
Posted: 2018-01-08 13:05
Non-Trading applications, but probably still interesting:
we use the ELK stack a lot. And grafana.

In addition, we started to blur the line between data output and logs a bit.

We have some metadata (like counts, averages, medians, standard deviations and ranges of values and of processing times, deltas of these values to historical data, "hasFinished" flags with timestamps, etc.) as structured data that is written to a database.

This makes it easy and straightforward to create supervisor and sanity-check jobs, as well as jobs that create technical reports and dashboards.
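A rough sketch of that pattern, with SQLite standing in for whatever database you actually use (schema invented for illustration):

```python
import sqlite3
import statistics
import time

def record_run_stats(conn, job, values):
    """Write one row of per-run summary stats; supervisor jobs can then
    compare them against historical rows and flag anomalies."""
    conn.execute("""CREATE TABLE IF NOT EXISTS run_stats (
        job TEXT, finished_at REAL, n INTEGER,
        mean REAL, median REAL, stdev REAL,
        vmin REAL, vmax REAL, has_finished INTEGER)""")
    conn.execute(
        "INSERT INTO run_stats VALUES (?,?,?,?,?,?,?,?,?)",
        (job, time.time(), len(values),
         statistics.mean(values), statistics.median(values),
         statistics.stdev(values) if len(values) > 1 else 0.0,
         min(values), max(values), 1))
    conn.commit()
```

A sanity-check job is then a plain SQL query over `run_stats`, comparing today's row against the historical distribution.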

Here we use Scala, Python, the database itself and (I am embarrassed) jenkins for alerts and e-mails.

I came here, saw you and your people smiling, and said to myself: Maggette, screw the small talk, let your fists do the talking...


Total Posts: 6
Joined: Oct 2010
Posted: 2018-01-09 12:53
This is what we use for monitoring our infrastructure (Micro$oft .NET and python shop).
Besides an automated trading systems marketplace, we run a traditional online brokerage business. I don't have any reference to compare scales, but we're generating around 100 GB of logs per day and 15,000 realtime metrics.

Everything is open source, running on auto-pilot with almost no maintenance needed :)


At the application level we verbosely log almost everything using log4net. We started by saving logs into daily rolling text files, but soon hit disk issues, so we created an appender that fires and forgets all log traces to rabbitmq.
These traces are consumed by a small process that indexes them into Elastic Search.

We're evaluating a migration from log4net to serilog to produce structured log traces, which are much easier to deal with at the monitoring stage.
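On the python side the same idea is easy to roll by hand; roughly something like this formatter, which emits one JSON object per line (field names are just an example, not our actual schema):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so the indexer receives fields
    (level, logger, free-form extras) instead of a blob of text."""
    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # anything attached via extra={"fields": {...}} becomes top-level keys
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)
```

Usage is the standard `logger.info("fill %s", sym, extra={"fields": {"qty": 100}})`, and Elasticsearch then indexes `qty` as a real field.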


We used graphite for a couple of years. A pain in the ass to install, but super powerful in terms of transforming, combining and performing computations on series data.

Again we faced some scalability issues (a disk bottleneck), and moved to influxdb (on windows!). Extremely easy to install, and telegraf (influxdb's plugin-driven agent for collecting and reporting metrics) is just awesome.

The downside is that influxdb is much worse in terms of metric aggregation and computation. We've had a hard time trying to replicate the real-time dashboards we had with the same graphite metrics.
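If you ever push custom metrics next to telegraf's, the line protocol itself is simple enough to build by hand. A Python sketch (not influxdb's official client, just an illustration of the format):

```python
def to_line_protocol(measurement, tags, fields, ts_ns=None):
    """Render one InfluxDB line-protocol record: measurement,tags fields [ts].
    Integers get the 'i' suffix, strings are double-quoted, floats go bare."""
    parts = []
    for key, value in sorted(fields.items()):
        if isinstance(value, bool):
            parts.append(f"{key}={str(value).lower()}")
        elif isinstance(value, int):
            parts.append(f"{key}={value}i")
        elif isinstance(value, str):
            parts.append(f'{key}="{value}"')
        else:
            parts.append(f"{key}={value}")
    head = measurement
    if tags:
        head += "," + ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    line = f"{head} {','.join(parts)}"
    return line if ts_ns is None else f"{line} {ts_ns}"
```

(This skips the escaping rules for spaces and commas inside values; a real producer needs those too.)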


Grafana. period.

Very easy to install, and super powerful. You can add multiple datasources (we've used graphite, influxdb and elastic search) and combine metrics into real-time dashboards. You can visualize time series, create single-panel metrics with alerting colors and much, much more.

We do use kibana for some manual elastic search log analysis, but once you identify something you want to monitor in real time, don't hesitate: grafana.


Years ago we developed a home-made alerting system based on some database flags. I've done some tests using grafana's alerting module and it's quite impressive. Out of the box you get email alerting (attaching fancy chart images), but the great thing is that you have plenty of channel plugins (like telegram, slack and so on).

For telephone/SMS we use traditional SMS providers, but if I had to choose now I'd have a look at twilio.

My two cents!


Total Posts: 12
Joined: Jun 2011
Posted: 2018-04-11 13:36
I've been doing some labs lately to move from a similar scenario (ssh into servers and grep) to something more practical.

I'm testing two separate pipelines.
Each trading server runs a filebeat agent that collects FIX logs in realtime and dumps them into a kafka topic as is, line by line.

#Pipeline 1 (Statistics and Monitoring)
Kafka -> Logstash -> ElasticSearch

Logstash parses the FIX logs into JSON.
At the end I use Kibana for easily searching through logs and Grafana for plotting stuff like orders/sec, rejects/sec, etc.
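The tag=value format is simple enough that the parsing can be sketched in a few lines of Python (tag map heavily truncated, names illustrative):

```python
SOH = "\x01"  # standard FIX field delimiter

# Tiny illustrative subset of FIX tag numbers -> readable names
TAGS = {"35": "MsgType", "54": "Side", "55": "Symbol",
        "38": "OrderQty", "44": "Price"}

def fix_to_dict(line):
    """Turn one raw FIX log line (tag=value pairs separated by SOH)
    into a flat dict ready to be serialized as JSON."""
    out = {}
    for pair in line.strip().split(SOH):
        if "=" not in pair:
            continue
        tag, _, value = pair.partition("=")
        out[TAGS.get(tag, tag)] = value
    return out
```

Once the messages are dicts, counting orders/sec or rejects/sec is a trivial aggregation on `MsgType`.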

#Pipeline 2 (PnL and Trade database)
For this I have a custom python service that gets trades from the kafka topic and marketdata from our ticker plant.
The service calculates positions and marks to market in realtime and pushes to a kafka topic.
Another service reads this topic and writes to Postgres.
Then I use Grafana to query periodically and update the PnL on the dashboards.
For actual production workloads this might be a lot of data and I'm looking into the TimescaleDB extension in case Postgres chokes.
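The position/mark-to-market step in that service boils down to something like this (a sketch with signed quantities and mid marks, not the actual service):

```python
from collections import defaultdict

def mark_to_market(trades, marks):
    """Net position and PnL per symbol. Trades are (symbol, signed_qty, price)
    tuples (buys positive, sells negative); marks map symbol -> current mid."""
    position = defaultdict(float)
    cash = defaultdict(float)
    for symbol, qty, price in trades:
        position[symbol] += qty
        cash[symbol] -= qty * price  # pay cash on buys, receive on sells
    return {s: {"position": position[s],
                "pnl": cash[s] + position[s] * marks[s]}
            for s in position}
```

The realtime version just keeps `position` and `cash` as running state and re-marks whenever the ticker plant pushes a new mid.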

For application logs I haven't had the time yet, but I want to try something similar to pipeline 1: just write to ElasticSearch and then use it for live tail and grep.