Ajitabh Pandey's Soul & Syntax

Exploring systems, souls, and stories – one post at a time

Category: Programming

  • Using Telegram for Automation Using Python Telethon Module

    Telegram is a cloud-based messaging application that provides an excellent set of APIs for developers to build automation on top of the platform. It is increasingly used to automate notifications and messages, and it has become a platform of choice for creating bots that interact with users and groups.

    Telethon is an asyncio-based Python 3 library for interacting with the Telegram API. It is one of the more comprehensive libraries available and lets you work with the Telegram API either as a user or as a bot.

    Recently I wrote some AWS Lambda functions to automate certain personal notifications. I could have run the code as a container on one of my VPSs, on Heroku, or on another platform, but I took this exercise as an opportunity to learn more about serverless functions. Also, my workload can easily fall within the Lambda free tier.

    In this post we will look at how to get started with development and write some basic Python applications.

    Registering As a Telegram Developer

    Follow these steps to obtain the API ID for Telegram:

    • Sign up for Telegram using any application.
    • Log in to the https://my.telegram.org/ website using the same mobile number. Telegram will send a confirmation code to your Telegram application. After you enter the confirmation code, you will see the following screen:
    Screenshot of Telegram Core Developer Page
    • On that screen, select API Development Tools and complete the form. The resulting page provides some basic information in addition to the api_id and api_hash.

    Setting up Telethon Development Environment

    I assume that the reader is familiar with basic Python and knows how to set up a virtual environment, so rather than explaining each step, I will focus on the commands needed to get the development environment up and running.

    $ mkdir telethon-dev && cd telethon-dev 
    $ python3 -m venv venv-telethon
    $ source venv-telethon/bin/activate
    (venv-telethon) $ pip install --upgrade pip
    (venv-telethon) $ pip install telethon
    (venv-telethon) $ pip install python-dotenv

    Obtaining The Telegram Session

    I will use a .env file to store the api_id and api_hash so that they can be read by the code we will write. Replace NNNNN with your api_id and the X characters with your api_hash.

    TELEGRAM_API_ID=NNNNN
    TELEGRAM_API_HASH=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

    Next we need to create a session to be used in our code. For full automation, the session must be stored either as a file or as a string. Since cloud environments destroy the ephemeral storage they provide, I will obtain the session as a string. The following Python code does that.

    #! /usr/bin/env python3
    
    import os
    
    from dotenv import load_dotenv
    
    from telethon.sync import TelegramClient
    from telethon.sessions import StringSession
    
    load_dotenv()
    
    with TelegramClient(StringSession(), os.getenv("TELEGRAM_API_ID"), os.getenv("TELEGRAM_API_HASH")) as client:
        print(client.session.save())

    When this code is executed, it prompts for your phone number, which you need to enter with the country code. In the next step, an authorization code is sent to your Telegram application; enter it at the prompt. Once the authorization code is entered correctly, the session is printed as a string on standard output. Save this value.

    (venv-telethon) $ ./get_string_session.py
     Please enter your phone (or bot token): +91xxxxxxxxxx
     Please enter the code you received: zzzzz
    Signed in successfully as KKKKKK KKKKKKK
    9vznqQDuX2q34Fyir634qgDysl4gZ4Fhu82eZ9yHs35rKyXf9vznqQDuX2q34Fyir634qgDyslLov-S0t7KpTK6q6EdEnla7cqGD26N5uHg9rFtg83J8t2l5TlStCsuhWjdzbb29MFFSU5-l4gZ4Fhu9vznqQDuX2q34Fyir634qgDysl9vznqQDuX2q34Fyir634qgDy_x7Sr9lFgZsH99aOD35nSqw3RzBmm51EUIeKhG4hNeHuF1nwzttuBGQqqqfao8sTB5_purgT-hAd2prYJDBcavzH8igqk5KDCTsZVLVFIV32a9Odfvzg2MlnGRud64-S0t7KpTK6q6EdEnla7cqGD26N5uHg9rFtg83J8t2l5TlStCsuhWjdzbb29MFFSU5=

    I normally put the string session in the .env file along with the API ID and hash. All three values must be protected and should never be shared with a third party.

    For the code that follows, I will assume that you have stored it in a variable named TELEGRAM_STRING_SESSION, so the final .env file will look like this:

    TELEGRAM_API_ID=NNNNN
    TELEGRAM_API_HASH=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    TELEGRAM_STRING_SESSION=YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
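
    A missing or misspelled variable in the .env file tends to show up later as a confusing error from the client. As a minimal sketch, assuming the three variable names used above, a small helper can check them up front. The name load_telegram_settings is my own and is not part of Telethon or python-dotenv.

```python
import os

# The three names used in the .env file above.
REQUIRED_SETTINGS = (
    "TELEGRAM_API_ID",
    "TELEGRAM_API_HASH",
    "TELEGRAM_STRING_SESSION",
)


def load_telegram_settings(env=None):
    """Return (api_id, api_hash, string_session) or raise if one is missing.

    Hypothetical helper for illustration. `env` defaults to os.environ,
    which load_dotenv() fills in from the .env file.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    # The API ID is numeric, so convert it once here.
    return (
        int(env["TELEGRAM_API_ID"]),
        env["TELEGRAM_API_HASH"],
        env["TELEGRAM_STRING_SESSION"],
    )
```

    Call load_dotenv() first and then load_telegram_settings() with no arguments; passing a plain dictionary instead is handy for quick checks.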

    Sending a Message to A Contact

    Now that the groundwork is done, we will write a simple Python application to send a message to a contact. The important point to note here is that the recipient must be in your Telegram contacts.

    #! /usr/bin/env python3
    
    import os
    import sys
    
    from telethon.sync import TelegramClient
    from telethon.sessions import StringSession
    from dotenv import load_dotenv
    
    load_dotenv()
    
    try:
        # Read the same variable names that were stored in the .env file
        client = TelegramClient(
            StringSession(os.getenv("TELEGRAM_STRING_SESSION")),
            int(os.getenv("TELEGRAM_API_ID")),
            os.getenv("TELEGRAM_API_HASH"),
        )
        client.start()
    except Exception as e:
        print(f"Exception while starting the client - {e}")
        # Without a started client there is nothing more to do
        sys.exit(1)
    else:
        print("Client started")
    
    async def main():
        try:
            # Replace the xxxxx in the following line with the full international mobile number of the contact
            # In place of mobile number you can use the telegram user id of the contact if you know
            ret_value = await client.send_message("xxxxxxxxxxx", 'Hi')
        except Exception as e:
            print(f"Exception while sending the message - {e}")
        else:
            print(f"Message sent. Return Value {ret_value}")
    
    with client:
        client.loop.run_until_complete(main())
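
    The script above relies on telethon.sync. The same send can also be written with plain asyncio, which may be a better fit inside a function-style deployment. This is only a sketch, assuming the variable names from the .env file shown earlier are present in the environment. The names send_notification and client_factory are made up for illustration; the factory argument exists so that the function can be exercised with a stand-in client.

```python
import asyncio
import os


async def send_notification(recipient, text, client_factory=None):
    """Send `text` to `recipient` and return whatever send_message returns."""
    if client_factory is None:
        # Imported lazily so this module loads even where Telethon is absent.
        from telethon import TelegramClient
        from telethon.sessions import StringSession

        def client_factory():
            return TelegramClient(
                StringSession(os.environ["TELEGRAM_STRING_SESSION"]),
                int(os.environ["TELEGRAM_API_ID"]),
                os.environ["TELEGRAM_API_HASH"],
            )

    # Create the client inside the coroutine so it uses the running event loop.
    async with client_factory() as client:
        return await client.send_message(recipient, text)
```

    A call such as asyncio.run(send_notification("me", "Hi")) would send the text to your own Saved Messages chat, since Telethon accepts "me" as a shortcut for the logged-in user. When running locally, call load_dotenv() first so that the environment variables are set.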

    Next Steps

    The Telethon API is quite versatile. Detailed API documentation can be found at https://tl.telethon.dev/. I hope this post helps the reader get started quickly with Telegram messaging using the Telethon module.

  • Adding Custom Python Packages for AWS Lambda Functions

    Python is a popular language, along with JavaScript (Node.js), for writing AWS Lambda functions. Lambda functions written in Python have access to the core modules, so one may choose to use http.client instead of the much simpler requests. However, if the function needs custom or non-native packages such as requests, there are a few methods available to us.

    In this article I will discuss one such method: uploading a zip file containing all the custom packages and adding an AWS Lambda Layer so that a particular function can use them. We will use Docker containers for this process. Strictly speaking, a Docker container is not always needed. We can simply run pip install -t, zip the directory and upload it. However, certain Python modules need to compile extensions written in C or C++. For such modules, the pip install -t approach will not work, because AWS Lambda functions run in an Amazon Linux environment while you may be on OSX, Windows or another Linux distribution. If you are sure that your modules do not have compiled extensions, you can skip the container and follow only steps 2 and 3 below.

    Step 1 – Build and Run the Docker Container

    The prerequisite for this step is to have Docker installed. If you are on OSX, you can use Docker Desktop. In this step we will use the Amazon Linux base image and install the desired version of Python along with a few modules and OS packages. Amazon Linux 2 is the long term support release available at the moment. It provides amazon-linux-extras, which makes newer application software available on a stable base. At the time of writing, Python 2.7 has been deprecated by Amazon and the recommended version is Python 3.8, so we will use amazon-linux-extras to install Python 3.8. The following simple Dockerfile is what we will use to build our image:

    FROM amazonlinux:2
    
    RUN amazon-linux-extras enable python3.8 && \
              yum install -y python38 && \
              yum install -y python3-pip && \
              yum install -y zip && \
              yum clean all
    
    RUN python3.8 -m pip install --upgrade pip && \
              python3.8 -m pip install virtualenv

    Build the image using the following command. The trailing dot sets the build context to the current directory:

    $ docker build -f Dockerfile.awslambda -t aws_lambda_layer:latest .

    Once the image is built, a container can be started from it:

    user1@macbook-air $ docker run -it --name aws_lambda_layer aws_lambda_layer:latest bash

    This gives a bash shell inside the container. The next step installs the required modules in a Python 3.8 virtual environment and packages them as a zip file.

    Step 2 – Install Non-Native Packages and Package These As A Zip File

    We will install the required packages inside a virtual environment. This allows the same container to be reused for future packaging as well.

    # python3.8 -m venv venv-telethon

    Next, activate the virtual environment and install the packages in it under a specific folder, so that the same can be packaged. After packaging the folder, the zip file needs to be copied outside the container so that the same can be uploaded –

    # source venv-telethon/bin/activate
    (venv-telethon) # pip install telethon -t ./python
    (venv-telethon) # deactivate
    
    # zip -r python.zip ./python/
    
    user1@macbook-air $ docker cp aws_lambda_layer:python.zip ./Desktop/
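
    Lambda looks for Python layer packages under a top-level python/ directory in the archive, which is why the packages were installed into ./python before zipping. If the zip utility is not at hand, the same archive can be produced with Python's standard library. This is a rough equivalent of the zip -r command above rather than a replacement for it, and build_layer_zip is a name chosen only for this example.

```python
import zipfile
from pathlib import Path


def build_layer_zip(package_dir, zip_path):
    """Zip every file under package_dir beneath a top-level 'python/' folder."""
    package_dir = Path(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                # Store the entry as python/<path relative to package_dir>.
                relative = path.relative_to(package_dir)
                archive.write(path, Path("python", relative).as_posix())
    return zip_path
```

    Running build_layer_zip("python", "python.zip") from the directory that contains the python folder produces an archive with the layout the layer expects. Keep the output file outside the folder being zipped.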

    Step 3 – Upload the Package to the AWS Lambda Layer or S3

    If the zip file is larger than 50 MB, it has to be uploaded to Amazon S3. If you upload it to S3, make sure you record the path of the file carefully.

    To upload the file, go to Layers in the Lambda console, click Create layer and fill in the form. The form lets you either upload the zip file directly or specify the Amazon S3 location where the file was uploaded.
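
    The choice between a direct upload and S3 can also be made in code when the layer is published from a script instead of the console. The sketch below only builds the dictionary describing where the archive lives; layer_content is a made-up name, and the bucket and key are whatever you used for your own upload.

```python
from pathlib import Path

# The 50 MB direct upload limit mentioned above.
DIRECT_UPLOAD_LIMIT = 50 * 1024 * 1024


def layer_content(zip_path, bucket=None, key=None, limit=DIRECT_UPLOAD_LIMIT):
    """Describe the layer archive either as raw bytes or as an S3 location."""
    zip_path = Path(zip_path)
    if zip_path.stat().st_size <= limit:
        # Small enough to send the bytes directly.
        return {"ZipFile": zip_path.read_bytes()}
    if not (bucket and key):
        raise ValueError("Archive is over the limit; pass the S3 bucket and key")
    return {"S3Bucket": bucket, "S3Key": key}
```

    The returned dictionary has the shape that the Content argument of boto3's publish_layer_version call expects, for example boto3.client("lambda").publish_layer_version(LayerName="telethon-layer", Content=layer_content("python.zip"), CompatibleRuntimes=["python3.8"]), where the layer name is just a placeholder.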

    Now write your lambda function and use the modules which were uploaded as a part of the zip file.
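
    Before writing the real function, it can help to confirm that the layer is attached and its packages are visible. A minimal sketch of such a check is below, assuming the layer carries telethon as in step 2 and that the function uses the usual lambda_handler entry point name.

```python
import importlib.util


def lambda_handler(event, context):
    """Report whether the telethon package from the layer can be found."""
    found = importlib.util.find_spec("telethon") is not None
    return {
        "statusCode": 200 if found else 500,
        "body": "telethon is available" if found else "telethon is missing; check the layer",
    }
```

    Invoking this with any test event from the console shows right away whether the layer was picked up, before any Telegram code is involved.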

  • Build An MP3 Catalogue System With Perl – Conclusion

    In the last post we saw how to read ID3v1 and ID3v2 tags using Perl. In this post we will continue our journey towards creating a simple catalog for the MP3 collection.

    Quickly Getting the Desired Information out of the MP3 – autoinfo()

    Usually in my catalog I am interested in the following information about an MP3: Title, Artist, Album, Track, Year, Genre, Comment and the Artwork. However, I do not want to loop through all the available information in my program to get this data. Fortunately the MP3::Tag module provides an autoinfo() function, which returns almost everything we need except the Artwork, which we have to gather separately. The autoinfo() function returns the title, album, artist, track, year, genre and comment. This information is obtained from the ID3v2 tag, ID3v1 tag, CDDB file, .inf file and the mp3 filename itself, wherever it is found first. The order of this lookup can be changed with the config() method. I want to restrict my cataloging to only the ID3v2 and ID3v1 tags.

    The following lines provide us with the needed information.

    $mp3->config("autoinfo", "ID3v2", "ID3v1");
    my ($title, $track, $artist, $album, $comment, $year, $genre) = $mp3->autoinfo();
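
    The "wherever it is found first" behaviour is a simple fallback chain. The short Python sketch below is not how MP3::Tag is implemented; it only illustrates the idea, assuming each field is resolved independently and that the tag data has already been read into plain dictionaries. The function name first_available is hypothetical.

```python
def first_available(field, sources, order=("ID3v2", "ID3v1")):
    """Return the first non-empty value for `field`, trying sources in order."""
    for name in order:
        value = sources.get(name, {}).get(field)
        if value:
            return value
    # Nothing found in any of the configured sources.
    return ""


# Example data standing in for what the two tags of one file might hold.
tags = {
    "ID3v2": {"title": "New Title", "artist": ""},
    "ID3v1": {"title": "Old Title", "artist": "Someone"},
}
title = first_available("title", tags)    # taken from ID3v2
artist = first_available("artist", tags)  # ID3v2 is empty, so ID3v1 is used
```

    Changing the order tuple plays the same role as the config() call above: it decides which source wins when several of them carry a value.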

    Getting Artwork Information

    The artwork is stored in the ID3v2 tag in a frame called APIC (Attached PICture). This frame has the _Data and MIME type fields, which are what we need. To extract this frame and its data we do not need to loop through all the tags. The MP3::Tag module provides the get_frame() method, with which we can extract any frame directly, as shown below for the artwork:

    my $apic_frame = $mp3->{ID3v2}->get_frame("APIC");
    my $img_data = $$apic_frame{'_Data'};
    my $mime_type = $$apic_frame{'MIME type'};

    The $img_data can be written out to a file, and the subtype part of $mime_type can be used as the file extension. Thus we can extract the artwork from the MP3 file. The MIME type is something like “image/jpeg”, and I have used the split function to get the string for the extension of the file.

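    # The MIME type looks like "image/jpeg"; the part after the slash is the extension
    my ($mime1, $mime2) = split(/\//, $mime_type);
    my $artwork_name = "artwork.$mime2";
    # Three-argument open with a lexical filehandle
    open(my $artwork_fh, '>', $artwork_name)
      or die "Error creating the artwork file: $!";
    # Image data is binary, so disable any newline translation
    binmode($artwork_fh);
    print {$artwork_fh} $img_data;
    close($artwork_fh);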
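
    The split on the slash assumes the MIME type is always present and well formed. For comparison, here is a small Python sketch of the same naming step that also copes with a missing value or trailing parameters. The function name and the "bin" fallback extension are choices made for this example only.

```python
def artwork_filename(mime_type, stem="artwork", default_ext="bin"):
    """Build a file name such as 'artwork.jpeg' from a MIME type like 'image/jpeg'."""
    # Drop any parameters (text after ';') and surrounding spaces.
    main = (mime_type or "").split(";", 1)[0].strip().lower()
    # The subtype is whatever follows the slash; empty if there is no slash.
    _, _, subtype = main.partition("/")
    return f"{stem}.{subtype or default_ext}"
```

    With this, artwork_filename("image/jpeg") gives artwork.jpeg, while an empty or missing MIME type falls back to artwork.bin instead of producing a name with no extension.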

    Generating the HTML using HTML::Template

    This is a simple project, so I have used the HTML::Template module to generate HTML on standard output, which can then be redirected to a file using shell redirection. To make the table layout less cumbersome, I have used the purecss.io CSS framework. Here is my HTML template code.

    my $template = <<HTML;
    <html>
    <head>
    <title>My MP3 Catalog</title>
    <link rel="stylesheet" href="http://yui.yahooapis.com/pure/0.5.0/pure-min.css">
    </head>
    <body>
    <h1>My MP3 Collection</h1>
    <table class="pure-table pure-table-horizontal">
        <thead>
            <tr>
                <th>Album Artwork</th>
    			<th>Album</th>
                <th>Track</th>
                <th>Title</th>
                <th>Artist</th>
    			<th>Year</th>
    			<th>Genre</th>
    			<th>Comment</th>
            </tr>
        </thead>
    
        <tbody>
    		<!-- TMPL_LOOP NAME=SONGS -->
    		<tr>
    			<td><a href="<TMPL_VAR NAME=FILEPATH>"><img src="<TMPL_VAR NAME=IMG>" height="150" width="150"/></a></td>
    			<td><!-- TMPL_VAR NAME=ALBUM --></td>
    			<td><!-- TMPL_VAR NAME=TRACK --></td>
    			<td><!-- TMPL_VAR NAME=TITLE --></td>
    			<td><!-- TMPL_VAR NAME=ARTIST --></td>
    			<td><!-- TMPL_VAR NAME=YEAR --></td>
    			<td><!-- TMPL_VAR NAME=GENRE --></td>
    			<td><!-- TMPL_VAR NAME=COMMENT --></td>
    		</tr>
    		<!-- /TMPL_LOOP -->
        </tbody>
    </table>
    
    </body>
    </html>
    HTML
    my $tmpl = HTML::Template->new(scalarref => \$template);

    Complete Script

    The complete script is on GitHub. You can have a look at it here:

    https://github.com/ajitabhpandey/learn-programming/blob/master/perl/id3-tags-manipulation/genCatalog.pl

  • Build An MP3 Catalogue System With Perl – Basics

    My mp3 collection was growing and I wanted to build a catalogue for it. Even a simple catalogue system involves several steps. In this post and the few that follow, I will explain how to write such a system using Perl as the programming language.

    MP3 Format and ID3 Tags

    MP3 is a coding format for digital audio. The audio data in the file is compressed, and the compression is lossy, meaning some of the original audio detail is discarded to reduce the file size. In spite of being a lossy format, it is one of the most popular formats for audio streaming and storage. An mp3 file also carries built-in bibliographic information such as the title, artist and album. This information is stored in a section of the file known as the ID3 tag. Using this information, MP3 players are able to display the song title, album name, artist name(s) and so on.

    There are a couple of versions of these ID3 tags in use: ID3v1 (1.1 being the last in the version 1 series) and ID3v2 (ID3v2.4 being the latest version).
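
    To make the older format concrete: an ID3v1 tag is a fixed block of 128 bytes at the very end of the file, starting with the marker TAG and followed by fixed-width title, artist, album, year, comment and genre fields. The short sketch below reads that block directly. It is written in Python purely to show the layout and is unrelated to how the Perl module does it; the function names are made up for the example.

```python
import os
import struct

# "TAG" + title(30) + artist(30) + album(30) + year(4) + comment(30) + genre(1)
ID3V1_FORMAT = "<3s30s30s30s4s30sB"


def parse_id3v1(block):
    """Parse a 128-byte ID3v1 block; return None if it is not a tag."""
    if len(block) != 128 or block[:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(ID3V1_FORMAT, block)

    def text(raw):
        # Fields are padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    tag = {"title": text(title), "artist": text(artist), "album": text(album),
           "year": text(year), "comment": text(comment), "genre": genre}
    # ID3v1.1: a zero byte at comment[28] means comment[29] is the track number.
    if comment[28] == 0 and comment[29] != 0:
        tag["track"] = comment[29]
    return tag


def read_id3v1(path):
    """Read the last 128 bytes of the file at `path` and parse them."""
    if os.path.getsize(path) < 128:
        return None
    with open(path, "rb") as handle:
        handle.seek(-128, os.SEEK_END)
        return parse_id3v1(handle.read(128))
```

    ID3v2 sits at the start of the file instead and is built from variable-length frames, which is why it can hold things like artwork that would never fit in these fixed fields.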

    Perl MP3::Tag Module

    The MP3::Tag module for Perl can be used to read and write both versions of the ID3 tag. Here are a few sample Perl programs that do that. They will help in understanding the usage of the module before we proceed to the next steps.

    Below are a couple of examples showing how to read ID3v1 and ID3v2 tags.

    #!/usr/bin/perl
    #
    # id3v1_read.pl
    #
    use 5.010;
    use warnings;
    use strict;
    use MP3::Tag;
    
    # set filename of MP3 track
    my $filename = "your_mp3_file";
    
    # create new MP3-Tag object
    my $mp3 = MP3::Tag->new($filename);
    
    # get tag information
    $mp3->get_tags();
    
    # check to see if an ID3v1 tag exists
    # if it does, print track information
    if (exists $mp3->{ID3v1}) {
      #$mp3->{ID3v1}->remove_tag();exit;
    
      say "Filename: $filename";
      say "Artist: " . $mp3->{ID3v1}->artist;
      say "Title: " . $mp3->{ID3v1}->title;
      say "Album: " . $mp3->{ID3v1}->album;
      say "Year: ". $mp3->{ID3v1}->year;
      say "Genre: " . $mp3->{ID3v1}->genre;
    } else {
      say "$filename: ID3v1 tag not found";
    }
    
    # clean up
    $mp3->close();

    ID3v2 tags are a bit more complex, as they allow a lot more information to be stored in the MP3 file, such as the album artwork. If you run the following script on one of your MP3 files, it will print all the ID3v2 information to the screen. I have used the getc() function so that you can observe the output and press <Enter> to proceed to the next key-value pair. After a couple of keystrokes you will see a lot of junk characters printed. These junk characters are nothing but the album artwork data, and following them is the MIME type of the artwork. In my case the MIME types were all “image/jpeg”.

    #!/usr/bin/perl
    #
    # id3v2_read.pl
    #
    use 5.010;
    use warnings;
    use strict;
    use MP3::Tag;
    
    # set filename of MP3 track
    my $filename = "mp3_file_name";
    
    # create new MP3-Tag object
    my $mp3 = MP3::Tag->new($filename);
    
    # get tag information
    $mp3->get_tags();
    
    # check to see if an ID3v2 tag exists
    # if it does, print track information
    if (exists $mp3->{ID3v2}) {
      # get a list of frames as a hash reference
      my $frames = $mp3->{ID3v2}->get_frame_ids();
    
      # iterate over the hash, process each frame
      foreach my $frame (keys %$frames) {
        # for each frame get a key-value pair of content-description
        my ($value, $desc) = $mp3->{ID3v2}->get_frame($frame);
        if (defined($desc) and length $desc) {
          say "$frame $desc: "; 
        } else {
          say "$frame :";
        }
        # sometimes the value is itself a hash reference containing more values
        # deal with that here
        if (ref $value eq "HASH") {
          while (my ($k, $v) = each (%$value)) {
            say "\n     - $k: $v";
          }
        } else {
          say "$value";
        }
        # allows to view each iteration
        getc(STDIN);
      }
    } else {
      say "$filename: ID3v2 tag not found";
    }
    
    # clean up
    $mp3->close();

    Next Steps

    Most current MP3 players read ID3v2 tags. It is worth understanding the structure of the ID3v2 tags, using one of the links I provided above. This will prepare you for the further articles in this series. In the next part we will see how to extract the desired information quickly and how to extract the artwork data from the MP3 file. Stay tuned for more.