Introducing phirehose, a library that makes it easy to use the Twitter Streaming API from PHP!

Hello.
I'm Mandai, the Wild team member in charge of development.

I'm not sure whether Twitter is still trendy these days or simply well established by now, but I thought I'd try using the Twitter Streaming API.
Writing a program from scratch would be too much work, so I tried a convenient library instead. It turned out to be incredibly easy, so I'd like to share it with you.

Obtain an access token and access secret for the Twitter API

Nothing can begin without this, so let's get it quickly.
You can either create a dedicated account or use an existing user account.

Log in to Twitter and open this page. You will find a link called "Create New App"; register your application from there.

The Streaming API has no limits on the number of accesses (or rather, since it's streaming, the idea is to keep the connection open and continuously acquire data), so there's no need to worry about being banned for overuse.
However, it seems that repeatedly retrying when a connection cannot be established can get you blocked, so your program needs to handle reconnection and response monitoring.

Developing that yourself takes time, so this time I will introduce a library that covers it for you.

 

Introducing phirehose, which makes it easy to use the Streaming API

phirehose is a PHP library that handles communication with the Twitter Streaming API.

It provides classes that handle everything from authentication to connection and data retrieval, so all you have to do is copy and paste the sample, fill in the access key information, and run it from the console to retrieve the data.

It's a public repository on GitHub, so you can start using it right away by cloning it with git.

# In a directory within the project
git clone https://github.com/fennb/phirehose.git
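To give a feel for what "filling in the sample" involves, here is a minimal sketch of handling one status line. As far as I can tell from the library's bundled examples, you extend OauthPhirehose and override an enqueueStatus($status) method that receives each raw JSON string; treat that as an assumption and check the repository. The function name formatStatus and the sample line are made up for illustration.

```php
<?php
// Hypothetical handler for one raw line received from the stream.
// In phirehose this logic would typically sit inside enqueueStatus($status)
// of a class extending OauthPhirehose (assumption, see the lead-in).
function formatStatus(string $rawJson): ?string
{
    $data = json_decode($rawJson, true);

    // Skip blank keep-alive lines, delete notices, and anything that is not a tweet.
    if (!is_array($data) || !isset($data['user']['screen_name'], $data['text'])) {
        return null;
    }

    return $data['user']['screen_name'] . ': ' . $data['text'];
}

// Stand-in for a line from the stream (made-up sample data).
$line = '{"text":"hello","user":{"screen_name":"example"}}';
echo formatStatus($line), "\n";
```

Keeping the handler this small matters because it is called once per tweet; anything heavy is better pushed to a queue and processed elsewhere.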

 

That is all there is to the setup, so next I'd like to go over the Streaming API endpoints that phirehose supports.

 

Public Streams

Public Streams is an API that retrieves data from the entire Twitter timeline.
There are two endpoints.

 

POST statuses / filter

The POST statuses/filter API is quite user-friendly: although it is defined with the POST method, it also reads GET parameters.

You narrow down the tweets to read using three criteria: Twitter user IDs, keywords, and location information.

 

follow (user ID)

The user IDs to filter.
In phirehose, these are passed as an array using the setFollow() method.

 

track (keyword)

Keywords for filtering.
In phirehose, these are passed as an array to the setTrack() method.
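On the wire, follow and track are each sent as a comma-separated list, and within track a phrase containing spaces matches only when all of its words appear in the tweet. phirehose presumably does this joining for you when you pass arrays, so the sketch below is only meant to show what the request parameters look like. The helper name buildFilterBody and the sample values are made up.

```php
<?php
// Hypothetical helper: builds the form body for POST statuses/filter
// from arrays of user IDs and keywords.
function buildFilterBody(array $follow, array $track): string
{
    $params = [];
    if ($follow) {
        $params['follow'] = implode(',', $follow); // user IDs, comma-separated
    }
    if ($track) {
        $params['track'] = implode(',', $track);   // phrases, comma-separated (OR)
    }

    // http_build_query() URL-encodes the commas and spaces.
    return http_build_query($params);
}

// Made-up user ID and keywords.
echo buildFilterBody(['12345'], ['php', 'streaming api']), "\n";
```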

 

locations (location information)

The location information to filter by.
Each area is a rectangle defined by two points, the bottom left and top right corners. Therefore, if you want to filter only within Japan, you need to combine multiple rectangles.
In phirehose, you pass these as an array using the setLocations() method.

The argument must always be a two-dimensional array. Each inner array describes one rectangle and contains four values in this order:

  1. Bottom left (southwesternmost) longitude
  2. Bottom left (southwesternmost) latitude
  3. Upper right (northeasternmost) longitude
  4. Upper right (northeasternmost) latitude

The outer array holds one such inner array per rectangle.
Therefore, by setting multiple rectangles, it is possible to build an area that covers the whole of Japan.

You can also filter to target only certain major cities.
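To make the array shape concrete, here is a small sketch. The first rectangle reuses the bounding_box coordinates for Asahi Ward, Yokohama from the sample data later in this article; the second rectangle and both helper names are made up for illustration. The last helper shows how such rectangles flatten into the comma-separated list of longitude and latitude values that the locations parameter takes.

```php
<?php
// Each rectangle: [SW longitude, SW latitude, NE longitude, NE latitude].
$boxes = [
    [139.488892, 35.440878, 139.570535, 35.506665], // Asahi Ward (from the sample data)
    [135.0, 34.0, 136.0, 35.0],                     // made-up rectangle
];

// Hypothetical helper: checks value ranges and corner ordering of one rectangle.
function isValidBox(array $box): bool
{
    if (count($box) !== 4) {
        return false;
    }
    [$west, $south, $east, $north] = $box;

    return $west >= -180 && $east <= 180
        && $south >= -90 && $north <= 90
        && $west < $east && $south < $north;
}

// Hypothetical helper: flattens rectangles into "lon,lat,lon,lat,..." form.
function toLocationsParam(array $boxes): string
{
    return implode(',', array_merge(...$boxes));
}

var_dump(array_map('isValidBox', $boxes));
echo toLocationsParam($boxes), "\n";
```

Swapping longitude and latitude is the easiest mistake to make here, which is why the sketch checks the ranges before anything is sent.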

 

GET statuses / sample

The GET statuses/sample API retrieves a small sample of tweets randomly selected from all tweets on Twitter.
It seems to represent only about 1% of the total data, but even so the volume is quite high, so be careful.

This endpoint has no options.

 

User Streams

This API targets a single user and retrieves their timeline, profile updates, events, and other information.
Opening streams for a large number of user accounts through this API could trigger connection limits on the same IP address, so consider another approach in that case.

The User Streams API also gives you the option to include replies and retweets directed at the target user.

* phirehose has a dedicated method for the track option, but there does not seem to be one for with or replies. Since I did not use the User Streams API this time, I have not looked into how to set them and will not cover that here (what follows is only a translation of the documentation provided by Twitter).

 

with (handling other accounts the user follows)

By default, "with=followings" is set, so data about the user and the users they follow is included in the stream.

"with=user" includes only data for the user's own account.

 

replies (handling replies)

By default, only replies between users who are followers of each other are streamed, but you can receive all replies by specifying "replies=all"

 

track (additional tweets by keyword)

By setting up a track, you can include additional tweets that match your keywords in your stream

 

Site Streams

The Site Streams API is currently in beta (and has been for quite some time). Whereas the User Streams API covers only one user's timeline, Site Streams combines the timelines of multiple users into a single stream.

As it is a beta version, there are various limitations, such as the following. Will it ever become GA?

  • A single connection delivers timelines (and other data such as profile updates) for 100 users. This can be expanded to up to 1,000 users using Control Streams.
  • Up to 25 connections per second. You must implement exponential back-off for errors caused by making too many calls and the like.
  • If you open more than about 1,000 connections, you'll need to coordinate testing and launch with the Twitter Platform team. (I looked up what the Twitter Platform team refers to but couldn't figure it out. Is it a team within Twitter? I'm not sure about this, so don't take it at face value.)
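Since exponential back-off comes up in the limits above, here is a rough sketch of what such a retry schedule could look like. The function name, the one-second base delay, and the 320-second cap are assumptions picked for illustration, not values taken from Twitter's documentation or from phirehose.

```php
<?php
// Hypothetical helper: seconds to wait before retry number $attempt (0-based).
function backoffDelay(int $attempt, float $base = 1.0, float $cap = 320.0): float
{
    // Double the wait after every failed attempt, but never exceed the cap.
    return min($cap, $base * (2 ** $attempt));
}

// Print the schedule for the first ten retries.
foreach (range(0, 9) as $attempt) {
    printf("retry %d: wait %.0f s\n", $attempt, backoffDelay($attempt));
}
```

In a real client you would sleep for the returned duration before reconnecting and reset the attempt counter once a connection succeeds.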

 

Sample of JSON data that can be obtained

Here's a sample of the retrieved JSON data.
Each data entry is relatively large, and you can get a variety of data.

array(25) {
  ["created_at"]=> string(30) "Mon Mar 27 04:41:07 +0000 2017"
  ["id"]=> float(0.000000000000E+17)
  ["id_str"]=> string(18) "0000000000000000"
  ["text"]=> string(78) "xxxxxxxxxxxxxxx"
  ["source"]=> string(82) "xxxxxxxxxxxxxxxx"
  ["truncated"]=> bool(false)
  ["in_reply_to_status_id"]=> NULL
  ["in_reply_to_status_id_str"]=> NULL
  ["in_reply_to_user_id"]=> NULL
  ["in_reply_to_user_id_str"]=> NULL
  ["in_reply_to_screen_name"]=> NULL
  ["user"]=> array(38) {
    ["id"]=> float(0000000000)
    ["id_str"]=> string(10) "0000000000"
    ["name"]=> string(13) "xxx xxx"
    ["screen_name"]=> string(8) "xxxxxxxx"
    ["location"]=> string(27) "xxxxxxxxxxxxxxxx"
    ["url"]=> NULL
    ["description"]=> string(135) "xxxxxxxxxx"
    ["protected"]=> bool(false)
    ["verified"]=> bool(false)
    ["followers_count"]=> int(669)
    ["friends_count"]=> int(533)
    ["listed_count"]=> int(1)
    ["favourites_count"]=> int(2267)
    ["statuses_count"]=> int(3727)
    ["created_at"]=> string(30) "Fri Mar 20 09:23:52 +0000 2015"
    ["utc_offset"]=> NULL
    ["time_zone"]=> NULL
    ["geo_enabled"]=> bool(true)
    ["lang"]=> string(2) "ja"
    ["contributors_enabled"]=> bool(false)
    ["is_translator"]=> bool(false)
    ["profile_background_color"]=> string(6) "C0DEED"
    ["profile_background_image_url"]=> string(48) "xxxxxxxxxxxx"
    ["profile_background_image_url_https"]=> string(49) "xxxxxxxxxxxx"
    ["profile_background_tile"]=> bool(false)
    ["profile_link_color"]=> string(6) "1DA1F2"
    ["profile_sidebar_border_color"]=> string(6) "C0DEED"
    ["profile_sidebar_fill_color"]=> string(6) "DDEEF6"
    ["profile_text_color"]=> string(6) "333333"
    ["profile_use_background_image"]=> bool(true)
    ["profile_image_url"]=> string(74) "xxxxxxxxxx"
    ["profile_image_url_https"]=> string(75) "xxxxxxxxxx"
    ["profile_banner_url"]=> string(59) "xxxxxxxxxx"
    ["default_profile"]=> bool(true)
    ["default_profile_image"]=> bool(false)
    ["following"]=> NULL
    ["follow_request_sent"]=> NULL
    ["notifications"]=> NULL
  }
  ["geo"]=> NULL
  ["coordinates"]=> NULL
  ["place"]=> array(9) {
    ["id"]=> string(16) "5ab538af7e3d614b"
    ["url"]=> string(56) "https://api.twitter.com/1.1/geo/id/5ab538af7e3d614b.json"
    ["place_type"]=> string(4) "city"
    ["name"]=> string(16) "Yokohama City Asahi Ward"
    ["full_name"]=> string(23) "Kanagawa Yokohama City Asahi Ward"
    ["country_code"]=> string(2) "JP"
    ["country"]=> string(6) "Japan"
    ["bounding_box"]=> array(2) {
      ["type"]=> string(7) "Polygon"
      ["coordinates"]=> array(1) {
        [0]=> array(4) {
          [0]=> array(2) { [0]=> float(139.488892) [1]=> float(35.440878) }
          [1]=> array(2) { [0]=> float(139.488892) [1]=> float(35.506665) }
          [2]=> array(2) { [0]=> float(139.570535) [1]=> float(35.506665) }
          [3]=> array(2) { [0]=> float(139.570535) [1]=> float(35.440878) }
        }
      }
    }
    ["attributes"]=> array(0) { }
  }
  ["contributors"]=> NULL
  ["is_quote_status"]=> bool(false)
  ["retweet_count"]=> int(0)
  ["favorite_count"]=> int(0)
  ["entities"]=> array(4) {
    ["hashtags"]=> array(0) { }
    ["urls"]=> array(0) { }
    ["user_mentions"]=> array(0) { }
    ["symbols"]=> array(0) { }
  }
  ["favorited"]=> bool(false)
  ["retweeted"]=> bool(false)
  ["filter_level"]=> string(3) "low"
  ["lang"]=> string(2) "ja"
  ["timestamp_ms"]=> string(13) "1490589667759"
}
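As a small sketch of working with this data: in the dump above, id comes out as a float, so it seems safer to read id_str instead. The example below also turns timestamp_ms into a readable UTC time and reads a nested field that may be NULL. The trimmed-down JSON string, its ID value, and the variable names are made up for illustration; only the timestamp_ms and country_code values are taken from the dump.

```php
<?php
// Made-up, trimmed-down status in the same shape as the dump above.
$json = '{"id_str":"123456789012345678","timestamp_ms":"1490589667759",'
      . '"coordinates":null,"place":{"country_code":"JP"}}';

$status = json_decode($json, true);

// Prefer the string ID: the numeric "id" may not fit PHP's integer type.
$id = $status['id_str'];

// timestamp_ms is a string of milliseconds; dropping the last three digits
// leaves Unix seconds without relying on 64-bit integers.
$seconds  = substr($status['timestamp_ms'], 0, -3);
$postedAt = (new DateTimeImmutable('@' . $seconds))->format('Y-m-d H:i:s');

// Nested fields can be NULL or missing, so fall back explicitly.
$country = $status['place']['country_code'] ?? 'unknown';

echo "$id posted at $postedAt UTC from $country\n";
```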

 

Summary

Implementing OAuth authentication for the Twitter API used to be a pain. It is commonplace now, but back then there was a lot you had to implement yourself, such as handling access tokens. Still, it's nice to have a library that gets you the data you need in about an hour at most, if not in five minutes, however much you fumble along the way.

That's all.


The person who wrote this article

Yoichi Bandai

My main job is developing web APIs for social games, but thankfully I'm also given the opportunity to work on various other tasks, including marketing.
My image rights within Beyond are treated as CC0.