At the end of February, a record 108,000 people from 208 countries and territories streamed into Barcelona for 2017’s Mobile World Congress.

The telecom industry’s largest conference and exhibition is always a combination of tech demos, big-name speakers, and behind-the-scenes talks and negotiations. This year was no exception.


What everyone was talking about

To identify the topics that created the most buzz at this year’s MWC, we streamed two hours’ worth of tweets as the conference ended, using Twitter’s Streaming API and Tweepy, a Python wrapper around it.

After collecting every tweet containing “MWC” or “MobileWorldCongress,” we removed punctuation and retweets, then sorted the remaining terms by frequency:

Most popular Twitter terms during Mobile World Congress 2017
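
The counting itself is straightforward. Here’s a minimal sketch of that frequency analysis, run on a few hard-coded sample tweets (the sample text is a placeholder, not our actual data; a real run would read the line-delimited JSON produced by the streaming script later in this post):

```python
import re
from collections import Counter

# Placeholder tweets standing in for a real capture file.
sample_tweets = [
    "RT @someone: MWC was great",        # retweet, will be dropped
    "The Nokia 3310 is back! #MWC2017",
    "Nokia 3310 has Snake. #MWC2017",
]

counter = Counter()
for text in sample_tweets:
    if text.startswith('RT '):
        continue  # drop retweets
    # strip punctuation (keeping # and @), lowercase, split on whitespace
    words = re.sub(r'[^\w\s#@]', '', text.lower()).split()
    counter.update(words)

print(counter.most_common(3))
```

Swap `sample_tweets` for the `text` field of each collected tweet and `most_common()` gives you the ranking shown above.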

After eliminating terms containing “MWC” or “2017,” we can use this data to build a list of the most tweeted-about releases at Mobile World Congress. In order of popularity:

  1. Nokia 3310: It’s the classic brick phone you know and love, revived in partnership with HMD. It’s $52, and it has Snake. The unique blend of nostalgia and simplicity that this phone embodies (only in 2017 can you “unplug” by getting a phone) has made it a hot topic of discussion.
  2. Huawei P10: The Chinese manufacturer, which hasn’t always used its appearance at MWC as an opportunity to show off its flagship line, changed that this year. Some suggest the P10’s unveiling marks the start of a renewed attempt by Huawei to win over European consumers.
  3. LG G6: LG’s rebooted G6 model took home several awards at Mobile World Congress, including Best Phone and Best in Show. Interestingly, it will be one of the first non-Pixel phones to ship with the Google Assistant installed.

The Nokia 3310 (Source: T3)

When we ran the same type of frequency analysis on the hashtags used to talk about MWC, we saw some interesting and divergent results:

The most popular Twitter hashtags at Mobile World Congress 2017

Mobile World Congress-related terms took first place again. Next, though, came not the 3310, but:

  1. #IoT: The Internet of Things left MWC with plenty of people talking about it, largely thanks to the impressive showcase of IoT solutions at GSMA’s Innovation City exhibit. Business Insider now predicts that 22.5 billion IoT devices will be installed by 2021 (about triple today’s total), driven by $4.8 trillion in aggregate investment.
  2. #Google: Google owns Android, so it should be no surprise that the “pure Google” phone was a hot topic of discussion throughout the conference. Then there was Google Assistant, pitted against Amazon Alexa, and Google’s extensive “Android Global Village.” Plenty of Mobile World Congress conversations involved Google in one way or another.
  3. #Android: Apple’s presence at trade shows in general has been virtually nonexistent since the mid-2000s. For the other big names in mobile, MWC is the place to unveil the newest entrants in their flagship lines. 2017, Android’s tenth anniversary, was no exception, and the top Android manufacturers were all out at MWC.
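
Hashtags don’t have to be scraped out of the tweet text at all: each streamed tweet carries an `entities.hashtags` field, so counting them is a matter of walking that structure. A sketch, using two inline sample tweets in place of a real capture file:

```python
import json
from collections import Counter

# Two sample lines standing in for a real line-delimited capture file.
sample_lines = [
    '{"text": "IoT demos everywhere", "entities": {"hashtags": [{"text": "IoT"}, {"text": "MWC17"}]}}',
    '{"text": "Assistant vs Alexa", "entities": {"hashtags": [{"text": "Google"}]}}',
]

hashtag_counts = Counter()
for line in sample_lines:
    tweet = json.loads(line)
    # each hashtag entry is a dict with the tag text (no leading #)
    for tag in tweet.get('entities', {}).get('hashtags', []):
        hashtag_counts['#' + tag['text'].lower()] += 1

print(hashtag_counts.most_common())
```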

Now for the technique we used to identify the buzziest topics above!

Getting started with the Twitter Streaming API

To use the Twitter Streaming API yourself, you need app credentials, which you can get by setting up a developer account on Twitter. Go to https://apps.twitter.com/ and select Create New App.

Twitter app management

After filling in all the information (when you’re asked for a website, you can use a placeholder like a social media profile), click on the app you created. At the top of the page you’ll see a button that says Keys and Access Tokens; click it to find the four credentials you need to get started:

Twitter app settings

  1. Consumer Key
  2. Consumer Secret
  3. Access Token
  4. Access Token Secret

Don’t upload these to GitHub (at least not in a public repo), post them on a blog, or otherwise share them.

Twitter’s APIs will rate-limit you pretty reliably if you attempt to connect too many times, and the lockout grows exponentially with every additional request you make after being blocked. If you hit the rate limiter while you’re playing around with the API, know that the block is tied to your token credentials.
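
Because the lockout doubles with repeated offenses, it pays to back off aggressively on your side too. A sketch of client-side exponential backoff delays (the 60-second base and retry count are illustrative assumptions, not documented Twitter figures):

```python
# Generate doubling retry delays for use after a rate-limit response.
# The base delay and retry count are assumptions for this sketch.
def backoff_delays(base_seconds=60, retries=5):
    """Yield base_seconds, then double it once per rate-limited retry."""
    delay = base_seconds
    for _ in range(retries):
        yield delay
        delay *= 2

print(list(backoff_delays()))  # [60, 120, 240, 480, 960]
```

Sleep for the next yielded delay before each reconnect attempt, and reset the generator once a connection succeeds.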

Using Tweepy to start streaming tweets

Tweepy is a Python library that makes it a lot easier to quickly get started with the Twitter Streaming API. You have a few different ways you can go about installing it:

  1. easy_install: easy_install tweepy
  2. Pip: pip install tweepy
  3. Git:
git clone git://github.com/tweepy/tweepy.git
cd tweepy
python setup.py install

Once you have Tweepy installed you’re ready to write your script. First, import Tweepy into your project along with some important methods. Then, define the credentials that you created earlier and authenticate yourself to the Twitter API through Tweepy. Here’s what all of that looks like:

import tweepy
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

consumer_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxx'
consumer_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)

api = tweepy.API(auth)

Then, you’ll want to set up a proper stream that will take the tweets you want and output them to a JSON file. By default, Tweepy returns a stream of data that’s a little hard to process.

Instead, use this code (courtesy of Marco Bonzanini) to stream each tweet to a new line in a JSON file:

class MyListener(StreamListener):

    def on_data(self, data):
        try:
            # append each raw tweet (already JSON) as its own line
            with open('nameofyourfilehere.json', 'a') as f:
                f.write(data)
                return True
        except BaseException as e:
            print('Error on_data: %s' % str(e))
        return True

Change “nameofyourfilehere” to whatever you want to call your output file. Next, we’ll follow Tweepy’s recommendation and add an error handler to the listener that stops the stream when Twitter returns error code 420, the status it sends when you’re being rate-limited:

    def on_error(self, status_code):
        if status_code == 420:
            return False

The last step, and the last two lines of your script, is where you decide which terms you want to track:

twitter_stream = Stream(auth, MyListener())
twitter_stream.filter(track=['nokia'])

Put all of this in a Python (.py) file, and running it will start streaming every tweet containing the word “nokia” into a JSON file. Choose a topic that’s being talked about a lot, and you can quickly collect thousands of tweets (and hundreds of MB of data) on your hard drive.

When prettified, each tweet looks like this:

{
  "created_at": "Fri Mar 10 21:24:34 +0000 2017",
  "id": 840312447684698113,
  "id_str": "840312447684698113",
  "text": "Viber in consideration of nokia e71 yet e72 masterpiece phones: csqKZS https:\/\/t.co\/fp3mHhcpyN",
  "source": "\u003ca href=\"https:\/\/dlvrit.com\/\" rel=\"nofollow\"\u003edlvr.it\u003c\/a\u003e",
  "truncated": false,
  "in_reply_to_status_id": null,
  "in_reply_to_status_id_str": null,
  "in_reply_to_user_id": null,
  "in_reply_to_user_id_str": null,
  "in_reply_to_screen_name": null,
  "user": {
    "id": 1225995182,
    "id_str": "1225995182",
    "name": "SalomonAaliyah",
    "screen_name": "SalomonAaliyah",
    "location": null,
    "url": null,
    "description": null,
    "protected": false,
    "verified": false,
    "followers_count": 144,
    "friends_count": 0,
    "listed_count": 47,
    "favourites_count": 0,
    "statuses_count": 147297,
    "created_at": "Wed Feb 27 20:34:03 +0000 2013",
    "utc_offset": 3600,
    "time_zone": "Amsterdam",
    "geo_enabled": false,
    "lang": "pl",
    "contributors_enabled": false,
    "is_translator": false,
    "profile_background_color": "C0DEED",
    "profile_background_image_url": "http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png",
    "profile_background_image_url_https": "https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png",
    "profile_background_tile": false,
    "profile_link_color": "1DA1F2",
    "profile_sidebar_border_color": "C0DEED",
    "profile_sidebar_fill_color": "DDEEF6",
    "profile_text_color": "333333",
    "profile_use_background_image": true,
    "profile_image_url": "http:\/\/pbs.twimg.com\/profile_images\/3348387308\/21b4580dd1a0cdddef2dc5d6efb8f824_normal.jpeg",
    "profile_image_url_https": "https:\/\/pbs.twimg.com\/profile_images\/3348387308\/21b4580dd1a0cdddef2dc5d6efb8f824_normal.jpeg",
    "default_profile": true,
    "default_profile_image": false,
    "following": null,
    "follow_request_sent": null,
    "notifications": null
  },
  "geo": null,
  "coordinates": null,
  "place": null,
  "contributors": null,
  "is_quote_status": false,
  "retweet_count": 0,
  "favorite_count": 0,
  "entities": {
    "hashtags": [],
    "urls": [
      {
        "url": "https:\/\/t.co\/fp3mHhcpyN",
        "expanded_url": "http:\/\/dlvr.it\/NbT7T0",
        "display_url": "dlvr.it\/NbT7T0",
        "indices": [71, 94]
      }
    ],
    "user_mentions": [],
    "symbols": []
  },
  "favorited": false,
  "retweeted": false,
  "possibly_sensitive": false,
  "filter_level": "low",
  "lang": "en",
  "timestamp_ms": "1489181074435"
}

A lot of data to work with!

From here, the tweet is your oyster. The possibilities for analysis are pretty much unlimited.
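
Since the stream writes one JSON object per line, loading the capture back into Python is just a loop of json.loads calls. A small helper to that effect (the file name is whatever you chose earlier; blank lines are skipped defensively):

```python
import json

def load_tweets(path):
    """Read a line-delimited JSON capture back into a list of dicts."""
    tweets = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # the raw stream can leave blank lines
            tweets.append(json.loads(line))
    return tweets
```

From there, something like `[t['text'] for t in load_tweets('nameofyourfilehere.json')]` gives you the tweet texts ready for analysis.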

If you’re just getting started, work your way through Marco Bonzanini’s “Mining Twitter Data with Python” tutorial, mentioned above. You’ll learn how to pre-process tweets for easier analysis, then do frequency analysis, co-occurrence analysis, sentiment analysis, and some basic data visualization.

If you’re interested in more complex analysis of the linguistic side of tweets, Google’s Cloud Natural Language API combined with BigQuery can yield some interesting results. Simpler sentiment analysis can be done with any of a variety of projects available on GitHub, such as TweetFeels.

Interested in the emojis being used on Twitter to talk about Nicki Minaj upon the release of her latest track, “No Frauds”? Easy!

With the jQCloud JavaScript library, you can use the raw JSON generated by your Python scripts to create nice emoji word clouds.
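
One way to produce that JSON from Python is to count emoji characters across your tweet texts and emit the list of `{"text": ..., "weight": ...}` objects that jQCloud consumes. A sketch on two inline sample texts; the character ranges below cover the main emoji blocks but are not exhaustive:

```python
import json
import re
from collections import Counter

# Rough match for the main emoji blocks (an assumption for this sketch;
# the full emoji set spans more ranges than these).
EMOJI = re.compile('[\U0001F300-\U0001F6FF\u2600-\u27BF]')

def emoji_cloud(texts):
    """Return jQCloud-style [{'text': ..., 'weight': ...}] for emoji in texts."""
    counts = Counter(ch for t in texts for ch in EMOJI.findall(t))
    return [{'text': ch, 'weight': n} for ch, n in counts.most_common()]

sample = ['No Frauds \U0001F525\U0001F525', 'queen \U0001F451']
print(json.dumps(emoji_cloud(sample)))
```

Dump that list to a file and hand it to jQCloud as its word array.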

Once you have your basic tweet-streaming script up and running, you can quickly put together some interesting analyses of topics as they unfold in real time. Have you done any experimenting with live Twitter data? Let us know in the comments below!