Open Sourcing our RTSP Server and video-rs

We are proud to announce that we have decided to open source two formerly
internal Oddity projects that help us read, process and distribute video
streams. Today, we’re open sourcing a Rust library called
video-rs that can read, write,
encode and decode video. video-rs is built on top of ffmpeg. The
library is currently focused on real-time video applications, but we
intend to grow it into a more generic video library. One project that already
uses video-rs internally is
Oddity’s RTSP server. RTSP is a
protocol that is at the heart of all video surveillance. We often found
ourselves needing an RTSP server for redistribution, restreaming, or other
purposes, and decided to build one ourselves. We are open sourcing both
projects hoping to enrich the ecosystem of video libraries in Rust.

Why Video is Important

Oddity.ai is a worldwide leader in violence detection, using deep
learning to create a safer world. Ever since we started in 2020, we have
been focused on developing an algorithm that can detect aggression with
the highest possible accuracy, and with the fewest false positives
possible. Building the most accurate AI has always been our primary
goal, and it is what we are most known for. But to make our technology
work in actual, real-world camera surveillance systems, we require
another important component. That component is what we call our “engine
platform”. It is responsible for reading security camera feeds, running
them through our AI model, and sending alarms to video management
systems of our customers. The engine must be stable and robust. It must
run unsupervised for years without maintenance, without failing. Some
of our early installations have been running continuously ever since
they were installed.

Usually, the security cameras themselves are the most error-prone part
of a video surveillance installation. They can be down due to network
issues or maintenance, or they might have broken, often due to weather
or vandalism. In other cases, a camera might reset itself and revert to
an incompatible profile.
Additionally, from time to time, a camera will appear to be working,
but doesn’t actually produce any video. In all cases, our engine must
log the issue and continue analysis on camera streams that are still
up. Moreover, when the camera is repaired, our engine needs to pick up
where it left off. From experience, we learned that having a well-built
and well-tuned library for video retrieval is key to ensuring the
stability of the engine. An added benefit is that we can use the same
library for many other purposes that come up during experimentation,
model validation, model debugging and video surveillance
implementations.
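The reconnect-and-resume behavior described above can be sketched as a capped exponential backoff loop. The snippet below is a minimal illustration of that idea, not the engine's actual code; the function name and the 1 s/60 s bounds are our own choices for this example:

```rust
use std::time::Duration;

// Hypothetical helper: delay before the n-th reconnect attempt.
// The delay doubles from 1 second and is capped at 60 seconds, so a
// repaired camera is picked up again within about a minute.
fn backoff(attempt: u32) -> Duration {
    let secs = 1u64.checked_shl(attempt).unwrap_or(u64::MAX).min(60);
    Duration::from_secs(secs)
}

fn main() {
    assert_eq!(backoff(0), Duration::from_secs(1));
    assert_eq!(backoff(3), Duration::from_secs(8));
    // Beyond 2^6 = 64 seconds the delay stays capped at 60 seconds.
    assert_eq!(backoff(10), Duration::from_secs(60));
    println!("ok");
}
```

The cap is what lets the engine "pick up where it left off" promptly: without it, a camera that was down for a weekend would not be retried for hours.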

video-rs

Over time, the part of our engine that handles video input grew into
its own separate component, and then into a library. In many of Oddity’s
projects, video-rs is now a dependency. Furthermore, we
believe it is mature enough to be an open source project as well!

The video-rs library is a high-level wrapper over
ffmpeg. The goal is to make video tasks, such as encoding
or muxing, much easier than it would be if you were using the
ffmpeg API. For example, transcoding a video to H264 is as
easy as this:

use video_rs::{self, Locator, Decoder, Encoder, EncoderSettings};

fn main() {
  // video-rs (and the ffmpeg library underneath it) must be
  // initialized once before use.
  video_rs::init().expect("failed to initialize video-rs");

  let source = Locator::Path(std::path::PathBuf::from("input.mp4"));
  let destination = Locator::Path(std::path::PathBuf::from("output.mp4"));
  let mut decoder = Decoder::new(&source)
    .expect("failed to create decoder");

  let (width, height) = decoder.size();
  let encoder_settings = EncoderSettings::for_h264_yuv420p(width as usize, height as usize, false);
  let mut encoder = Encoder::new(&destination, encoder_settings)
    .expect("failed to create encoder");

  for decoded in decoder.decode_iter() {
    // `decode_iter` yields an error once the end of the stream is
    // reached, so we stop there instead of panicking.
    let Ok((timestamp, frame)) = decoded else {
      break;
    };
    encoder.encode(&frame, &timestamp)
      .expect("failed to encode");
  }

  // Flush any buffered frames and finalize the output file.
  encoder.finish()
    .expect("failed to finish encoding");
}

We designed the API to be simple and foolproof. For example, we opted to create
certain encoding presets (like for_h264_yuv420p) that users will be likely to
need. Using video-rs should lead to less code compared to ffmpeg. Of course,
some custom or advanced use cases might still require using native ffmpeg.

The example below shows the power of ndarray combined with video-rs. In just
a couple of lines of code, we can draw an animation and produce an MP4 video file
encoded by ffmpeg:

use video_rs::{
  self,
  Locator,
  Encoder,
  EncoderSettings,
  Time,
};

fn main() {
  // As before, video-rs must be initialized once before use.
  video_rs::init().expect("failed to initialize video-rs");

  let width = 800_i32;
  let height = 600_i32;
  let color_bg = [240, 240, 240];
  let color_circle = [197, 0, 0];
  let destination = Locator::Path(std::path::PathBuf::from("output.mp4"));

  let encoder_settings = EncoderSettings::for_h264_yuv420p(width as usize, height as usize, false);
  let mut encoder = Encoder::new(&destination, encoder_settings)
    .expect("failed to create encoder");

  let duration: Time = std::time::Duration::from_nanos(1_000_000_000 / 60).into();
  let mut position = Time::zero();

  let center_x = width / 2;
  let center_y = height / 2;
  for size in 4..520 {
    // Using some Pythagoras magic to draw a circle that grows bigger and bigger!
    let frame = ndarray::Array3::from_shape_fn(
      (height as usize, width as usize, 3),
      |(y, x, c)| {
        let dx = (x as i32 - center_x).abs();
        let dy = (y as i32 - center_y).abs();
        let d = ((dx.pow(2) + dy.pow(2)) as f64).sqrt();
        if d < size.into() { color_circle[c] } else { color_bg[c] }
      });

    encoder.encode(&frame, &position)
      .expect("failed to encode frame");

    position = position.aligned_with(&duration).add();
  }

  encoder.finish()
    .expect("failed to finish encoding");
}


Growing red circle produced by the above code.
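The per-frame timestamp arithmetic in the example above (advancing `position` by one frame duration each iteration) is equivalent to multiplying the frame index by the frame duration. A small sketch in plain Rust, without video-rs, also shows a subtlety of an integer-nanosecond timebase at 60 fps:

```rust
use std::time::Duration;

fn main() {
    // One frame at 60 fps, matching the encoder example above.
    let frame_duration = Duration::from_nanos(1_000_000_000 / 60);
    // The timestamp of frame `i` is simply i * frame_duration.
    let t_of = |i: u32| frame_duration * i;
    assert_eq!(t_of(0), Duration::ZERO);
    // 1/60 s truncates to 16_666_666 whole nanoseconds, so 60 frames
    // land at 999_999_960 ns rather than exactly one second.
    assert_eq!(t_of(60), Duration::from_nanos(999_999_960));
    println!("frame 60 at {:?}", t_of(60));
}
```

This is why video libraries typically track time as a rational timebase rather than accumulating rounded durations; video-rs hides that bookkeeping behind its `Time` type.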

To learn more about video-rs, head over to
GitHub. video-rs is also
published on crates.io. To use it in your
project, add the following to your dependencies in Cargo.toml:

video-rs = "0.1"

To use it with ndarray, enable the corresponding feature like so:

video-rs = { version = "0.1", features = ["ndarray"] }

Oddity’s RTSP Server

Our first use for video-rs was building an in-house RTSP server. We had tried a
couple of approaches to building an RTSP server before; earlier implementations
were in C or C++. Building the RTSP server in Rust turned out to be the perfect choice.
Rust is well-suited for network programming: low-level enough for
implementing a protocol like RTSP, and high-level enough that we were able to
complete the first prototype in a couple of weeks. Rust’s strictness is also
particularly useful when building a protocol parser, where even the tiniest bugs
can take hours to debug and solve.
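To give a flavor of the kind of parsing involved, an RTSP request line has the shape `METHOD uri RTSP/1.0`, much like HTTP. The sketch below is a deliberately simplified illustration, not the server's actual parser:

```rust
// Simplified sketch: parse an RTSP request line ("METHOD uri RTSP/1.0")
// into its method and URI. The real server's parser handles far more,
// including headers, CSeq tracking, and malformed input.
fn parse_request_line(line: &str) -> Option<(&str, &str)> {
    let mut parts = line.trim_end().split(' ');
    let method = parts.next()?;
    let uri = parts.next()?;
    // Reject anything that is not RTSP 1.0, e.g. an HTTP request line.
    match parts.next()? {
        "RTSP/1.0" => Some((method, uri)),
        _ => None,
    }
}

fn main() {
    let (method, uri) =
        parse_request_line("DESCRIBE rtsp://10.0.0.100/stream/0 RTSP/1.0\r\n").unwrap();
    assert_eq!(method, "DESCRIBE");
    assert_eq!(uri, "rtsp://10.0.0.100/stream/0");
    assert!(parse_request_line("GET / HTTP/1.1").is_none());
    println!("ok");
}
```

Even in this toy version, the type system forces every malformed case through `Option`, which is exactly the strictness that pays off in a protocol parser.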

Let’s look at some of the ways we are using oddity-rtsp-server.

Restreaming

Old or cheap surveillance cameras often have strict limits on the number of
RTSP streams, or the maximum outbound bandwidth. Some scenarios require more
consumers than the camera supports. Oddity’s RTSP server can consume
one stream from a security camera, and then serve an unlimited number of
streams to any clients that need it.

To configure such a scenario, download and install Oddity’s RTSP server using
the instructions on GitHub.

This is an example of a configuration file that can be used for
restreaming. In this example, the RTSP server reads a stream from an
RTSP camera with IP 10.0.0.100 and relays the stream to
clients on the path /relayed/camera/1.

server:
  host: 0.0.0.0
  # Note that using port 554 requires elevated permissions!
  port: 554
media:
  - name: "Camera 1"
    path: "/relayed/camera/1"
    kind: stream
    source: "rtsp://10.0.0.100/stream/0"

No matter how many clients connect to the server at the same time, from the
perspective of the camera, there is only a single connection.

Restreaming can be useful in certain network scenarios as well. Some
networks are locked down in such a way that RTSP streams cannot reach
their desired destination. For example, an inaccessible server might have bound
itself to port 554, which can’t be reached through the network firewall. In that
case, you might want to configure Oddity’s RTSP server to bind to an accessible
port and relay streams as needed.
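A sketch of that scenario, assuming the same configuration format as the earlier example: binding to an unprivileged port such as 8554 sidesteps both a firewall rule on 554 and the need for elevated permissions.

```yaml
server:
  host: 0.0.0.0
  # An unprivileged port avoids both the firewall rule on 554 and the
  # need for elevated permissions.
  port: 8554
media:
  - name: "Relayed Camera"
    path: "/relayed/camera/1"
    kind: stream
    source: "rtsp://10.0.0.100/stream/0"
```

Clients then connect to rtsp://server-address:8554/relayed/camera/1 instead of the camera directly.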

Integration Testing

Most video management systems have no method of inserting fake video
streams for testing. We often find ourselves wanting to test our
violence detection system in an end-to-end fashion. One option would be
to load one of our office cameras into the VMS, and then do some
pretend fighting. Of course, we got a bit tired of that at some point.

To get around this, our RTSP server can be configured to play a video file on
repeat and produce an RTSP stream from it that looks like a real camera stream
to a VMS. To do this, we use a configuration similar to this:

server:
  host: 0.0.0.0
  port: 554
media:
  - name: "Fake Camera 1"
    path: "/always-violence"
    kind: file
    source: "/data/video/violence_01.mp4"

Debugging

While testing our system, we have run into hundreds of different
camera models, from cheap $100 IP cameras to $10k military-grade PTZ
cameras with 32x optical zoom. They do not always work as expected.
After inspecting the logs, we often find some quirky or erroneous RTSP
implementation detail that causes availability issues. Reproducing such
issues is notoriously hard. Nowadays, we replicate the buggy camera's
behavior in our own RTSP server instead. Using this strategy, we have
been able to fix issues much more quickly.

Contributions Welcome!

In the process of developing the best violence detection algorithm, we learned
about the wonderful world of video and all its complexities. We ended up
developing some very useful software to deal with it, and as those projects
matured, open sourcing them felt like the right choice. We hope that by doing this, we
can help enrich the Rust ecosystem, especially in the domain of video.

Oddity still is, and will stay, an AI company first. There are many
people in the industry who know far more about the domain of video
than we do. Hopefully, by open sourcing our code, we can learn from
those people as well. We are open to contributions!