Capturing video from network cameras, part 1

Network cameras are gradually replacing analog ones, although they are currently much more expensive. Network cameras have some obvious advantages:

no need for a separate recorder or capture card;
noise immunity;
simple integration into an existing network;
no distance limit;
availability of high-resolution cameras;
viewing the image directly from the camera over HTTP;
a variety of settings;
and so on.

We are interested in how to obtain images from such cameras, and for that we first need to know how they transmit them. Fortunately, cameras use existing standards rather than something each manufacturer invented on its own. The overwhelming majority of cameras support one or more of the following video transmission methods: Motion JPEG over HTTP, Motion JPEG over RTSP, or H.264 over RTSP. Many cameras can also transmit sound, but that does not interest us here.

In this article I will look at these methods of transferring images from network cameras and give an example of capturing such images, once again in Python.

MJPEG over HTTP

The simplest way to transfer a picture is MJPEG over HTTP. In this case, frames are sent as complete JPEG files separated by special delimiters. The MIME type multipart was developed for exactly such cases. It has several subtypes; we are interested in mixed and x-mixed-replace. For our purposes there is practically no difference between them, and we will process them in exactly the same way. The difference is semantic: mixed simply indicates a document consisting of several parts, which may be independent or may belong together, while x-mixed-replace says explicitly that each new part replaces the previous one and should be treated as an update of the same presentation. The "x-" prefix marks the type as experimental, but it is nevertheless widely used.

In the HTTP header, the MIME type is specified in the Content-Type parameter:

 Content-Type: multipart/mixed; boundary="some_boundary"

 Content-Type: multipart/x-mixed-replace; boundary=other_boundary

These types have one required parameter, boundary, which specifies the string that separates the parts of the document. In the body, the separator is prefixed with two hyphens. It is important that the separator does not occur inside the data of a part if the size of that part is not given in Content-Length.
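
As a rough illustration of these rules, the boundary can be pulled out of a Content-Type value with the Python standard library. This is only a sketch, written for Python 3 (the listings later in the article are Python 2); the helper name part_delimiter is made up for this example, and it leans on email.message.Message to cope with both the quoted and the unquoted form.

```python
from email.message import Message


def part_delimiter(content_type):
    """Return the delimiter line (as bytes) for a multipart Content-Type value.

    Hypothetical helper: accepts both boundary="abc" and boundary=abc.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    boundary = msg.get_boundary()  # None when the parameter is missing
    if boundary is None:
        raise ValueError("no boundary parameter in %r" % content_type)
    # In the body, every part starts with two hyphens followed by the boundary.
    return b"--" + boundary.encode("ascii")
```

For the two header values shown earlier this gives the delimiter lines --some_boundary and --other_boundary.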

To understand the structure of the transfer of MJPEG over HTTP, just look at the following example:

    HTTP/1.0 200 OK
    Connection: close
    Server: MJPG-Streamer/0.2
    Cache-Control: no-store, no-cache, must-revalidate, pre-check=0, post-check=0, max-age=0
    Pragma: no-cache
    Expires: Mon, 3 Jan 2000 12:34:56 GMT
    Content-Type: multipart/x-mixed-replace;boundary=boundarydonotcross

    --boundarydonotcross
    Content-Type: image/jpeg
    Content-Length: 23950
    X-Timestamp: 0.000000

    %Binary JPEG%
    --boundarydonotcross
    Content-Type: image/jpeg
    Content-Length: 24756
    X-Timestamp: 0.000000

    %Binary JPEG%
    --boundarydonotcross
    Content-Type: image/jpeg
    Content-Length: 23950
    X-Timestamp: 0.000000

    %Binary JPEG%

To see how your network camera transmits images, you can either use a sniffer (I use Wireshark) or connect to the camera with telnet, for example:

    $ telnet 192.168.0.50 80
    Trying 192.168.0.50...
    Connected to 192.168.0.50.
    Escape character is '^]'.
    GET /jpeg HTTP/1.1

After the GET line, you must also send one empty line to indicate that the request header has ended. Instead of "/jpeg", use the request path at which your camera serves MJPEG.
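
The same check can be scripted instead of typed into telnet. The sketch below is Python 3 and uses only the standard socket module. The names build_request and peek_response are invented for this example, and the Host line is an addition of mine, since HTTP/1.1 formally requires it even though many cameras answer without it.

```python
import socket


def build_request(path, host):
    """Build a minimal HTTP GET request; the empty line ends the header."""
    lines = [
        "GET %s HTTP/1.1" % path,
        "Host: %s" % host,    # formally required by HTTP/1.1
        "Connection: close",
        "",                   # the empty line that closes the header
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def peek_response(host, port, path, limit=1024, timeout=5.0):
    """Connect to the camera and return up to `limit` bytes of its reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(path, host))
        chunks = []
        received = 0
        while received < limit:
            chunk = sock.recv(limit - received)
            if not chunk:  # the camera closed the connection
                break
            chunks.append(chunk)
            received += len(chunk)
    return b"".join(chunks)
```

Calling peek_response("192.168.0.50", 80, "/jpeg") with your camera's address and path returns the start of the reply, which is enough to read the response header and the first part header.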

In the example above, %Binary JPEG% stands for the data we are interested in: the JPEG image. This is what we need to extract from the stream.

As you can see, the response begins with a standard HTTP header describing the document. Connection can be either close or keep-alive; in our case it does not matter. From the header we need just two lines: the first one, with the status 200 OK, which tells us that everything is fine and we can watch for the birdie; and Content-Type, from which we take the boundary parameter.

After the HTTP header (after the empty line) comes the body of the document, which consists of many parts. Each part starts with the separator, has its own header and, after an empty line, its own body. Content-Type: image/jpeg tells us that we are in fact receiving JPEG images, Content-Length gives the size of the current frame in bytes (23950 bytes in the first part), and X-Timestamp may carry the timestamp of the current frame. For this purpose you could also use the local computer's time at the moment the frame is received, but X-Timestamp is more accurate, because the network can affect the frame timing in unpredictable ways.
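
To make the role of these fields concrete, here is a small Python 3 sketch that turns the header of one part into a dictionary and picks a timestamp for the frame. The function names are hypothetical, and the fallback to the local clock simply follows the reasoning in the paragraph before this example.

```python
import time


def parse_part_header(block):
    """Turn the header of one multipart part into a dict.

    `block` is the bytes between the delimiter line and the empty line.
    Names are lower-cased because HTTP header names are case-insensitive.
    """
    fields = {}
    for line in block.decode("latin-1").splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip anything that is not "Name: value"
            fields[name.strip().lower()] = value.strip()
    return fields


def frame_time(fields):
    """Prefer the camera's X-Timestamp; fall back to the local clock."""
    if "x-timestamp" in fields:
        return float(fields["x-timestamp"])
    return time.time()


sample = (b"Content-Type: image/jpeg\r\n"
          b"Content-Length: 23950\r\n"
          b"X-Timestamp: 0.000000")
fields = parse_part_header(sample)
frame_size = int(fields["content-length"])  # bytes to read after the empty line
```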

Python MJPEG over HTTP Client

Although the transfer format is simple, receiving the images can be implemented in different ways. TCP segmentation also plays a role, or rather the way we deal with it. The maximum size of a packet (MTU) on Ethernet is normally 1500 bytes, so a frame arrives in many pieces, and we process the data every time such a piece comes in. If you parse the data immediately on arrival, it may turn out to be incomplete, and the parser will not be able to do its job. Buffering the incoming stream, on the other hand, costs some performance and memory. It would be more reliable to buffer and start parsing only once enough data has accumulated: first read up to '\r\n\r\n' to get the header, and then either read until two delimiters have appeared in the stream, or read up to the empty line, determine the image size and read that many bytes. Nevertheless, I chose to process the data immediately as it arrives.
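
For comparison, the buffered variant that relies on Content-Length might look roughly like this. It is a Python 3 sketch, not the client used in this article: the class name FrameBuffer is invented, it expects the boundary as bytes, and it assumes that every part carries a Content-Length line.

```python
class FrameBuffer:
    """Accumulate raw body bytes and cut complete frames out of them.

    Nothing is parsed until a whole part header, and then a whole image,
    is present in the buffer, so the chunk size does not matter.
    """

    def __init__(self, boundary):
        self.delimiter = b"--" + boundary
        self.buffer = bytearray()

    def feed(self, data):
        """Add a chunk of any size; return the frames completed by it."""
        self.buffer += data
        frames = []
        while True:
            start = self.buffer.find(self.delimiter)
            if start < 0:
                break  # no delimiter yet
            header_end = self.buffer.find(b"\r\n\r\n", start)
            if header_end < 0:
                break  # the part header is still incomplete
            size = self._content_length(bytes(self.buffer[start:header_end]))
            body_start = header_end + 4
            if len(self.buffer) < body_start + size:
                break  # the image is still incomplete
            frames.append(bytes(self.buffer[body_start:body_start + size]))
            # Drop everything up to the end of this image.
            del self.buffer[:body_start + size]
        return frames

    @staticmethod
    def _content_length(header):
        for line in header.split(b"\r\n"):
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                return int(value.decode("ascii"))
        raise ValueError("part header has no Content-Length")
```

Because a frame is cut out by its length, a byte sequence inside the JPEG data that happens to look like the delimiter does no harm. The price is that a partly received frame is held in memory until its last byte arrives.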

The client code consists of two files: main.py and http_mjpeg_client.py. The first launches the application, and the second implements the work with the camera. Here they are.
main.py

    from twisted.internet import reactor
    from http_mjpeg_client import MJPEGFactory

    def processImage(img):
        'This function is invoked by the MJPEG Client protocol'
        # Process image
        # Just save it as a file in this example
        f = open('frame.jpg', 'wb')
        f.write(img)
        f.close()

    def main():
        print 'Python M-JPEG Over HTTP Client 0.1'

        # Define connection parameters, login and password are optional.
        config = {'request': '/mjpeg',
                  'login': 'admin',
                  'password': 'admin',
                  'ip': '127.0.0.1',
                  'port': 8080,
                  'callback': processImage}

        # Make a connection
        reactor.connectTCP(config['ip'], config['port'], MJPEGFactory(config))
        reactor.run()
        print 'Python M-JPEG Client stopped.'

    # this only runs if the module was *not* imported
    if __name__ == '__main__':
        main()

http_mjpeg_client.py

    # Written for Python 2 (print statement, str used as a byte string)
    from twisted.internet.protocol import Protocol, ClientFactory
    from base64 import b64encode
    import re

    debug = 1

    class MJPEGClient(Protocol):
        def __init__(self):
            # A place for configuration parameters
            self.config = {}
            # If we are connected to a web server
            self.isConnected = False
            # The boundary in multipart stream
            self.boundary = ''
            # Actual image data goes here
            self.img = ''
            # Size of the image frame being downloaded
            self.next_img_size = 0
            # Indicates that currently parsing a header
            self.isHeader = False

        def connectionMade(self):
            # Implement basic authorization (login and password are optional)
            if self.config.get('login'):
                credentials = self.config['login'] + ':' + self.config.get('password', '')
                authstring = 'Authorization: Basic ' + b64encode(credentials) + '\r\n'
            else:
                authstring = ''
            # Form proper HTTP request with header
            to_send = 'GET ' + self.config['request'] + ' HTTP/1.1\r\n' + \
                authstring + \
                'User-Agent: Python M-JPEG Client\r\n' + \
                'Keep-Alive: 300\r\n' + \
                'Connection: keep-alive\r\n\r\n'
            # Send it
            self.transport.write(to_send)
            if debug: print 'We say:\n', to_send

        def dataReceived(self, data):
            if debug: print 'Server said:\n', len(data), 'bytes of data.'
            if not self.isConnected:
                # Response header goes before empty line.
                # Only strip on the left: the tail may already be JPEG data.
                data_sp = data.lstrip().split('\r\n\r\n', 1)
                header = data_sp[0].splitlines()
                # Parse header
                for line in header:
                    if line.endswith('200 OK'):
                        # Connection went fine
                        self.isConnected = True
                        if debug: print 'Connected'
                    if line.startswith('Content-Type: multipart'):
                        # Got multipart; the boundary may or may not be quoted
                        r = re.search(r'boundary="?([^";]+)', line)
                        if r:
                            self.boundary = r.group(1) # Extract boundary
                            if debug: print 'Got boundary:', self.boundary
                # If we got more data, find a JPEG there
                if len(data_sp) == 2:
                    self.findJPEG(data_sp[1])
            else:
                # If connection is already made find a JPEG right away
                self.findJPEG(data)

        def findJPEG(self, data):
            hasMoreThanHeader = False
            # If we know next image size, then image header is already parsed
            if not self.next_img_size:
                # Otherwise it should be a header first
                for line in data.splitlines():
                    if line == '--'+self.boundary:
                        self.isHeader = True
                        if debug: print 'Got frame header'
                    elif line == '':
                        if self.isHeader:
                            # If we might have more data after a header in a buffer
                            hasMoreThanHeader = True
                        self.isHeader = False
                    elif self.isHeader:
                        # Here we can parse all the header information
                        # But we are really interested only in one
                        if line.startswith('Content-Length:'):
                            self.next_img_size = int(line.split(':', 1)[1])
                            if debug: print 'Next frame size:', self.next_img_size
            else:
                # How many bytes left to read
                remains = self.next_img_size - len(self.img)
                self.img += data[:remains]
                # We got the whole image
                if len(self.img) == self.next_img_size:
                    if debug: print 'Got a frame!'
                    # Run a callback function
                    self.config['callback'](self.img)
                    # Reset variables
                    self.img = ''
                    self.next_img_size = 0
                # If something left in a buffer
                if data[remains:]:
                    self.findJPEG(data[remains:])
            if hasMoreThanHeader:
                data_sp = data.split('\r\n\r\n', 1)
                # If there is something after a header in a buffer
                if len(data_sp) == 2:
                    self.findJPEG(data_sp[1])

        def connectionLost(self, reason):
            print 'Connection lost, reconnecting'
            self.isConnected = False
            self.img = ''
            self.next_img_size = 0
            self.isHeader = False
            self.boundary = ''

    class MJPEGFactory(ClientFactory):
        def __init__(self, config):
            self.protocol = MJPEGClient
            self.config = config

        def buildProtocol(self, addr):
            prot = ClientFactory.buildProtocol(self, addr)
            # Weird way to pass the config parameters to the protocol
            prot.config = self.config
            return prot

        def clientConnectionLost(self, connector, reason):
            # Automatic reconnection
            connector.connect()

In the main file, we define the connection parameters for the camera in the config dictionary, start the reactor of the Twisted network framework, and process the received images in the processImage() function. In this example, each received frame is simply written to the file frame.jpg in the current directory, overwriting the previous one.
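
If you would rather keep every frame, the callback can be swapped for something like the sketch below. The factory make_frame_saver and the file naming scheme are my own invention for illustration; the code avoids Python 3 only features, so it should also run under the Python 2 used by the listings. It checks only the JPEG start marker, because some cameras may append padding after the end marker.

```python
import itertools
import os


def make_frame_saver(directory):
    """Return a callback that stores every frame under its own file name."""
    counter = itertools.count()

    def save_frame(img):
        # A JPEG file starts with the SOI marker FF D8.
        if not img.startswith(b"\xff\xd8"):
            return None  # ignore data that is clearly not a JPEG frame
        path = os.path.join(directory, "frame_%06d.jpg" % next(counter))
        with open(path, "wb") as f:
            f.write(img)
        return path

    return save_frame
```

To use it, put 'callback': make_frame_saver('.') into the config dictionary in place of processImage.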

I tested the client with MJPG-streamer, which I started like this:

    ./mjpg_streamer -i "./input_testpicture.so" -o "./output_http.so -w ./www"

In this case, the request in the client configuration must be set to /?action=stream .
It refused to stream images from the webcam.

I tried to comment the second file well so that it is easier to follow how images are extracted from the stream. In words, the algorithm is as follows. When the connection to the camera is established, we first build an HTTP request header and send it; this is the connectionMade() function. The dataReceived() function is called whenever new data arrives. In it, we check whether the transfer of JPEG data has already been established. If not, the camera's HTTP response header should be arriving; we separate it with split('\r\n\r\n', 1) and then go through it line by line, picking out the parameters we need (the status and boundary). Otherwise, we pass the received data straight to the findJPEG() function.

This function also branches, depending on whether the header of the current JPEG part has already been received. If not, we wait for it and parse it. If it has, the incoming data is the JPEG image itself, and we append it to self.img until all self.next_img_size bytes of the image have arrived; then we call the function passed to us in the callback configuration parameter and hand it the image we have just received.

The debug variable can be set to zero to disable the debug output.

Download the source code from this link: Python MJPEG over HTTP Client .

To be continued…

Further reading:

MIME
Motion JPEG
HTTP headers list

PS: I decided to split the article into two parts. As a single piece it would be rather long, and I do not want to overload it, so that it is easier to understand and more convenient to read.

Source: https://habr.com/ru/post/115808/
