Telegram bots. Uploading files larger than 50 MB

Telegram bots let you automate many processes. They are not limited to a single chat: in essence, a bot is just an input-output interface that, besides text, can also receive and send files such as images, video, audio, and documents.

For users, the maximum file size is 1.5 GB.
Bots are limited to only 50 MB.
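
To decide whether a given file needs the workaround at all, its size can be compared with the bot limit. This is a minimal sketch: the helper names are hypothetical, and treating "50 MB" as 50 * 1024 * 1024 bytes is an assumption, since the exact byte threshold is not stated here.

```python
import os

# Assumption: "50 MB" is interpreted as 50 MiB; the exact threshold may differ.
BOT_UPLOAD_LIMIT = 50 * 1024 * 1024


def needs_agent(size_bytes: int, limit: int = BOT_UPLOAD_LIMIT) -> bool:
    """Return True when a file of this size is too large for the bot to upload itself."""
    return size_bytes > limit


def file_needs_agent(path: str) -> bool:
    """Same check, reading the size of a file on disk (hypothetical helper)."""
    return needs_agent(os.path.getsize(path))
```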

How to get around this restriction is described below.

Telegram API

Since users can upload files of up to 1.5 GB, so can we. To do this, we create an agent (named this way to avoid confusion with bots) that works together with our Telegram bot. This requires a separate user account and access to the Telegram API.

First, go to https://core.telegram.org and register an application following the instructions there. In the end you should get an api_id and an api_hash.

What does the agent do?

A bot cannot upload files larger than 50 MB, but if it has the file_id of a file that is already on the Telegram servers, it can send that file. So the algorithm is as follows:

The application running on the server and working through the Bot API generates a file to send
It calls the agent to upload the file to the Telegram servers
It receives the file_id from the agent
It sends the uploaded file using that file_id.
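
The steps above can be sketched as a small function that asks the agent for an upload only when no file_id is known yet. This is an illustration, not the author's code: `get_file_id`, the `cache` dictionary, and the `upload_via_agent` callable are hypothetical names, and the agent is abstracted as a function that returns a file_id.

```python
from typing import Callable, Dict


def get_file_id(object_id: str,
                cache: Dict[str, str],
                upload_via_agent: Callable[[str], str]) -> str:
    """Return a file_id for object_id, calling the agent only on a cache miss."""
    if object_id not in cache:
        # The agent uploads the file and hands back a file_id (abstracted here).
        cache[object_id] = upload_via_agent(object_id)
    # From here on the bot can send the file by file_id without uploading again.
    return cache[object_id]
```

In the real setup the file_id arrives asynchronously, in the message the agent sends to the bot, so the "return value" of the agent is a simplification.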

Code example

The need to upload large files came up while writing @AudioTubeBot: initially, the audio file was split into parts and sent piece by piece. I decided to move the large-file upload functionality into a separate application, which is called via subprocess.check_call.

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    from telethon import TelegramClient
    from telethon.tl.types import DocumentAttributeAudio
    import mimetypes

    entity = 'AudioTube_bot'  # session name
    api_id = 1959
    api_hash = '88b68d6da53fe68c1c3541bbefc'
    phone = '+79620181488'

    client = TelegramClient(entity, api_id, api_hash, update_workers=None, spawn_read_thread=False)
    client.connect()
    if not client.is_user_authorized():
        # client.send_code_request(phone)  # requests a login code; repeated requests can trigger FloodWait
        client.sign_in(phone, input('Enter code: '))
    client.start()


    def main(argv):
        # Positional arguments passed by the bot
        file_path = argv[1]
        file_name = argv[2]
        chat_id = argv[3]
        object_id = argv[4]
        bot_name = argv[5]
        duration = argv[6]

        mimetypes.add_type('audio/aac', '.aac')
        mimetypes.add_type('audio/ogg', '.ogg')

        # Send the file to the bot; the caption carries chat_id:object_id:duration
        msg = client.send_file(
            str(bot_name),
            file_path,
            caption=str(chat_id + ':' + object_id + ':' + duration),
            file_name=str(file_name),
            use_cache=False,
            part_size_kb=512,
            attributes=[DocumentAttributeAudio(
                int(duration),
                voice=None,
                title=file_name[:-4],
                performer='')]
        )
        client.disconnect()
        return 0


    if __name__ == '__main__':
        import sys
        main(sys.argv[0:])

Comments:

That is the entire code. It uses the Telethon library. On launch, the program is passed the path to the file to send, the file name, chat_id (the user the file is intended for), the object_id, the name of the bot that called the agent (for example, I have beta and release bots), and the track duration.
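
Because the agent reads six positional arguments, a small validation step can make a malformed call fail early. The following is a sketch under that assumption; `AgentArgs` and `parse_agent_args` are hypothetical names that are not part of the agent's code.

```python
from typing import List, NamedTuple


class AgentArgs(NamedTuple):
    file_path: str
    file_name: str
    chat_id: str
    object_id: str
    bot_name: str
    duration: int


def parse_agent_args(argv: List[str]) -> AgentArgs:
    """Validate argv (script name plus six values) and convert duration to int."""
    if len(argv) != 7:
        raise SystemExit(
            "usage: main.py FILE_PATH FILE_NAME CHAT_ID OBJECT_ID BOT_NAME DURATION")
    file_path, file_name, chat_id, object_id, bot_name, duration = argv[1:]
    return AgentArgs(file_path, file_name, chat_id, object_id, bot_name, int(duration))
```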

client.send_file

Simply uploading the file to the server with upload, getting the file_id, and passing it to the bot will not work: a file_id is valid only inside the chat in which it was created. For our bot to be able to send the file to the user by file_id, the agent must send the file to the bot. The bot then receives its own file_id for the file and can use it as it likes.

caption=str(...) - wat?!

The agent sends files only to the bot and adds a comment in the caption. I use it to pass:

end user chat_id
track duration
object_id in the database to which the file_id should be bound, so that the file does not have to be uploaded again (indexing, optimization and all that)
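
The caption is therefore a small colon-separated record in the order chat_id:object_id:duration. A minimal sketch of packing and unpacking it, with hypothetical helper names and assuming none of the fields contains a colon:

```python
from typing import Tuple


def build_caption(chat_id: int, object_id: str, duration: int) -> str:
    """Pack the three fields into the caption sent along with the file."""
    return f"{chat_id}:{object_id}:{duration}"


def parse_caption(caption: str) -> Tuple[int, str, int]:
    """Unpack a caption; raises ValueError unless it has exactly three fields."""
    chat_id, object_id, duration = caption.split(":")
    return int(chat_id), object_id, int(duration)
```

Failing loudly on a malformed caption is a design choice of this sketch; the handler could instead ignore such messages.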

An example of a call in the bot code

The file to upload has already been saved to disk at path_file_mp3; we call the agent as a subprocess and wait for it to finish.

code

    status = subprocess.check_call(
        "python3.6 audiotubeagent36/main.py " + path_file_mp3 + ' '
        + audio_title + '.' + us_audio_codec + ' '
        + str(chat_id) + ' '
        + str(pool_object['_id']) + ' '
        + config.BOT_NAME + ' '
        + str(duration),
        shell=True)
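
One caveat with building the command as a single string: a title that contains spaces or shell metacharacters will shift or break the agent's arguments. A possible alternative, sketched here with hypothetical helper names, is to pass the arguments as a list so each value arrives as exactly one argv entry. Using `sys.executable` as the default interpreter is an assumption; the call above uses python3.6 explicitly.

```python
import subprocess
import sys
from typing import List


def build_agent_command(script: str, file_path: str, file_name: str,
                        chat_id: int, object_id: str, bot_name: str,
                        duration: int, python: str = sys.executable) -> List[str]:
    """Build the agent's argv as a list; no shell quoting is needed."""
    return [python, script, file_path, file_name,
            str(chat_id), str(object_id), bot_name, str(duration)]


def run_agent(command: List[str]) -> int:
    """Run the agent and wait; raises CalledProcessError on a non-zero exit code."""
    return subprocess.check_call(command)
```

For example, `run_agent(build_agent_command("audiotubeagent36/main.py", path_file_mp3, audio_title + "." + us_audio_codec, chat_id, pool_object["_id"], config.BOT_NAME, duration, python="python3.6"))` would mirror the call above.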

In the bot's handler for incoming messages, we do something like this.

code

    if message.content_type in ['document', 'audio']:
        user_id = message.from_user.id
        bot_settings = SafeConfigParser()
        bot_settings.read(config.PATH_SETTINGS_FILE)
        c_type = message.content_type
        # Only accept files sent by an admin account (the agent)
        if functions.check_is_admin(bot_settings, user_id):
            if c_type == 'audio':
                file_id = message.audio.file_id
                audio_title = message.audio.title
            else:
                file_id = message.document.file_id
                audio_title = message.document.file_name[:-4]
            client_chat_id = message.caption
            if client_chat_id.find(u':') != -1:
                client_chat_id, q_pool_obj_id, duration_s = re.split(r':', client_chat_id)
                # bind the file_id to the object in the database
                q_pool.update_request_file_id(str(q_pool_obj_id), str(file_id))
                # send the file to the end user by file_id
                bot.send_audio(int(client_chat_id), file_id, caption='',
                               duration=int(duration_s), title=audio_title, performer='')
            return

Send questions and suggestions in the comments or in the chat.

Source: https://habr.com/ru/post/348234/
