Quick Start¶
In this section, we go over everything you need to know to start building scripts or bots using PRAW, the Python Reddit API Wrapper. It’s fun and easy. Let’s get started.
Prerequisites¶
Prerequisite | Details |
---|---|
Python Knowledge: | You need to know at least a little Python to use PRAW; it’s a Python wrapper after all. PRAW supports Python 2.7, and Python 3.3 to 3.6. If you are stuck on a problem, /r/learnpython is a great place to ask for help. |
Reddit Knowledge: | A basic understanding of how reddit.com works is a must. In the event you are not already familiar with Reddit start with their FAQ. |
Reddit Account: | A Reddit account is required to access Reddit’s API. Create one at reddit.com. |
Client ID & Client Secret: | These two values are needed to access Reddit’s API as a script application (see Authenticating via OAuth for other application types). If you don’t already have a client ID and client secret, follow Reddit’s First Steps Guide to create them. |
User Agent: | A user agent is a unique identifier that helps Reddit determine the source of network requests. To use Reddit’s API, you need a unique and descriptive user agent. The recommended format is <platform>:<app ID>:<version string> (by /u/<Reddit username>). For example, android:com.example.myredditapp:v1.2.3 (by /u/kemitche). Read more about user-agents at Reddit’s API wiki page. |
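The recommended user agent format can be assembled with a small helper. This is a minimal sketch: `build_user_agent` and its parameter names are hypothetical and not part of PRAW; only the format itself comes from the description above.

```python
def build_user_agent(platform, app_id, version, username):
    """Return a user agent string in Reddit's recommended format.

    Hypothetical helper for illustration; not part of PRAW.
    """
    return "{}:{}:{} (by /u/{})".format(platform, app_id, version, username)


user_agent = build_user_agent(
    "android", "com.example.myredditapp", "v1.2.3", "kemitche"
)
print(user_agent)  # android:com.example.myredditapp:v1.2.3 (by /u/kemitche)
```

The resulting string is what you would pass wherever a user agent is expected.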
With these prerequisites satisfied, you are ready to learn how to do some of the most common tasks with Reddit’s API.
Common Tasks¶
Obtain a Reddit Instance¶
You need an instance of the Reddit class to do anything with PRAW. There are two distinct states a Reddit instance can be in: read-only and authorized.
Read-only Reddit Instances¶
To create a read-only Reddit instance, you need three pieces of information:
- client ID
- client secret
- user agent
You may choose to provide these by passing in three keyword arguments when calling the initializer of the Reddit class: client_id, client_secret, and user_agent (see Configuring PRAW for other methods of providing this information). For example:
import praw
reddit = praw.Reddit(client_id='my client id',
                     client_secret='my client secret',
                     user_agent='my user agent')
Just like that, you now have a read-only Reddit instance.
print(reddit.read_only) # Output: True
With a read-only instance, you can do something like obtaining 10 ‘hot’ submissions from /r/learnpython:
# continued from code above
for submission in reddit.subreddit('learnpython').hot(limit=10):
    print(submission.title)
# Output: 10 submission titles
If you want to do more than retrieve public information from Reddit, then you need an authorized Reddit instance.
Note
In the above example we are limiting the results to 10. Without the limit parameter PRAW should yield as many results as it can with a single request. For most endpoints this results in 100 items per request. If you want to retrieve as many as possible, pass in limit=None.
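As a rough illustration of what the limit parameter implies, the sketch below estimates how many requests a listing would need, assuming the 100-items-per-request figure holds for the endpoint in question. `estimate_requests` is a hypothetical helper, not part of PRAW, and it ignores the possibility that fewer items exist than were asked for.

```python
import math


def estimate_requests(limit, per_request=100):
    """Estimate the requests needed to fetch `limit` items.

    Hypothetical helper: assumes each request returns up to `per_request`
    items. Returns None for limit=None, because the total then depends on
    how many items are available.
    """
    if limit is None:
        return None
    return int(math.ceil(limit / float(per_request)))


print(estimate_requests(10))    # 1
print(estimate_requests(250))   # 3
print(estimate_requests(None))  # None
```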
Authorized Reddit Instances¶
In order to create an authorized Reddit instance, two additional pieces of information are required for script applications (see Authenticating via OAuth for other application types):
- your Reddit user name, and
- your Reddit password
Again, you may choose to provide these by passing in keyword arguments username and password when you call the Reddit initializer, like the following:
import praw
reddit = praw.Reddit(client_id='my client id',
                     client_secret='my client secret',
                     user_agent='my user agent',
                     username='my username',
                     password='my password')
print(reddit.read_only) # Output: False
Now you can do whatever your Reddit account is authorized to do. And you can switch back to read-only mode whenever you want:
# continued from code above
reddit.read_only = True
Note
If you are uncomfortable hard-coding your credentials into your program, there are some options available to you. Please see Configuring PRAW.
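One generic way to keep credentials out of source code is to read them from environment variables and pass them on as keyword arguments. The sketch below is only an illustration of that idea: `reddit_kwargs_from_env` and the `MYBOT_*` variable names are hypothetical, and Configuring PRAW remains the reference for the mechanisms PRAW itself provides.

```python
import os


def reddit_kwargs_from_env(environ=os.environ, prefix="MYBOT_"):
    """Collect Reddit credentials from environment variables.

    The MYBOT_* variable names are hypothetical; choose your own.
    Raises KeyError if a required variable is missing.
    """
    names = ("client_id", "client_secret", "user_agent", "username", "password")
    return {name: environ[prefix + name.upper()] for name in names}


# Demonstration with a stand-in environment instead of real secrets.
fake_env = {
    "MYBOT_CLIENT_ID": "my client id",
    "MYBOT_CLIENT_SECRET": "my client secret",
    "MYBOT_USER_AGENT": "my user agent",
    "MYBOT_USERNAME": "my username",
    "MYBOT_PASSWORD": "my password",
}
kwargs = reddit_kwargs_from_env(fake_env)
print(sorted(kwargs))
# In a real script: reddit = praw.Reddit(**reddit_kwargs_from_env())
```

The keys of the returned dictionary match the keyword arguments used in the examples above, so the dictionary can be unpacked directly into the Reddit initializer.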
Obtain a Subreddit¶
To obtain a Subreddit instance, pass the subreddit’s name when calling subreddit on your Reddit instance. For example:
# assume you have a Reddit instance bound to variable `reddit`
subreddit = reddit.subreddit('redditdev')
print(subreddit.display_name) # Output: redditdev
print(subreddit.title) # Output: reddit Development
print(subreddit.description) # Output: A subreddit for discussion of ...
Obtain Submission Instances from a Subreddit¶
Now that you have a Subreddit instance, you can iterate through some of its submissions, each bound to an instance of Submission. There are several sorts that you can iterate through:
- controversial
- gilded
- hot
- new
- rising
- top
Each of these methods will immediately return a ListingGenerator, which is to be iterated through. For example, to iterate through the first 10 submissions based on the hot sort for a given subreddit try:
# assume you have a Subreddit instance bound to variable `subreddit`
for submission in subreddit.hot(limit=10):
    print(submission.title)  # Output: the submission's title
    print(submission.score)  # Output: the submission's score
    print(submission.id)     # Output: the submission's ID
    print(submission.url)    # Output: the URL the submission points to
                             # or the submission's URL if it's a self post
Note
The act of calling a method that returns a ListingGenerator does not result in any network requests until you begin to iterate through the ListingGenerator.
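Ordinary Python generators behave in a comparable way, which makes them a convenient analogy. The sketch below uses a toy generator, `fake_listing`, with a simulated request; it is plain Python and says nothing about how ListingGenerator is implemented internally.

```python
request_log = []


def fake_listing(limit):
    """Toy stand-in for a listing; yields placeholder titles lazily."""
    request_log.append("request sent")  # runs only once iteration starts
    for number in range(limit):
        yield "submission {}".format(number)


listing = fake_listing(limit=3)
calls_before = len(request_log)  # creating the generator ran nothing
titles = list(listing)           # iterating triggers the simulated request
calls_after = len(request_log)

print(calls_before)  # 0
print(calls_after)   # 1
print(titles)        # ['submission 0', 'submission 1', 'submission 2']
```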
You can create Submission instances in other ways too:
# assume you have a Reddit instance bound to variable `reddit`
submission = reddit.submission(id='39zje0')
print(submission.title) # Output: reddit will soon only be available ...
# or
submission = reddit.submission(url='https://www.reddit.com/...')
Obtain Redditor Instances¶
There are several ways to obtain a redditor (a Redditor instance). Two of the most common ones are:
- via the author attribute of a Submission or Comment instance
- via the redditor() method of Reddit
For example:
# assume you have a Submission instance bound to variable `submission`
redditor1 = submission.author
print(redditor1.name) # Output: name of the redditor
# assume you have a Reddit instance bound to variable `reddit`
redditor2 = reddit.redditor('bboe')
print(redditor2.link_karma) # Output: bboe's link karma
Obtain Comment Instances¶
Submissions have a comments attribute that is a CommentForest instance. That instance is iterable and represents the top-level comments of the submission by the default comment sort (best). If you instead want to iterate over all comments as a flattened list you can call the list() method on a CommentForest instance. For example:
# assume you have a Submission instance bound to variable `submission`
top_level_comments = list(submission.comments)
all_comments = submission.comments.list()
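To make the difference between top-level comments and a flattened list concrete, here is a toy model of a comment tree. `ToyComment` and `flatten` are hypothetical stand-ins rather than PRAW classes, and the level-by-level (breadth-first) order is an assumption of this sketch; consult the CommentForest documentation for the order PRAW actually produces.

```python
from collections import deque


class ToyComment(object):
    """Hypothetical stand-in for a comment with nested replies."""

    def __init__(self, body, replies=()):
        self.body = body
        self.replies = list(replies)


def flatten(top_level):
    """Return every comment in the tree, visiting level by level."""
    flat = []
    queue = deque(top_level)
    while queue:
        comment = queue.popleft()
        flat.append(comment)
        queue.extend(comment.replies)
    return flat


forest = [
    ToyComment("a", [ToyComment("a1", [ToyComment("a1x")])]),
    ToyComment("b", [ToyComment("b1")]),
]
top_level_bodies = [c.body for c in forest]
all_bodies = [c.body for c in flatten(forest)]
print(top_level_bodies)  # ['a', 'b']
print(all_bodies)        # ['a', 'b', 'a1', 'b1', 'a1x']
```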
Note
The comment sort order can be changed by updating the value of comment_sort on the Submission instance prior to accessing comments (see: /api/set_suggested_sort for possible values). For example, to have comments sorted by new try something like:
# assume you have a Reddit instance bound to variable `reddit`
submission = reddit.submission(id='39zje0')
submission.comment_sort = 'new'
top_level_comments = list(submission.comments)
As you may be aware, there will periodically be MoreComments instances scattered throughout the forest. Replace those MoreComments instances at any time by calling replace_more() on a CommentForest instance. Calling replace_more() accesses comments, and so must be done after comment_sort is updated. See Extracting comments with PRAW for an example.
Determine Available Attributes of an Object¶
If you have a PRAW object, e.g., Comment, Message, Redditor, or Submission, and you want to see what attributes are available along with their values, use Python’s built-in vars() function. For example:
import pprint
# assume you have a Reddit instance bound to variable `reddit`
submission = reddit.submission(id='39zje0')
print(submission.title) # to make it non-lazy
pprint.pprint(vars(submission))
Note the line where we print the title. PRAW uses lazy objects so that network requests to Reddit’s API are only issued when information is needed. Here, before the print line, submission points to a lazy Submission object. When we try to print its title, additional information is needed, thus a network request is made, and the instance ceases to be lazy. Outputting all the attributes of a lazy object will result in fewer attributes than expected.
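The effect on vars() can be reproduced with a toy lazy object. `LazyThing` below is a hypothetical class written for this illustration, with a simulated fetch and made-up values; it is a sketch of the general pattern, not PRAW's actual implementation.

```python
class LazyThing(object):
    """Toy lazy object: loads its data on first missing attribute."""

    def __init__(self, thing_id):
        self.id = thing_id
        self._fetched = False

    def _fetch(self):
        # Stand-in for a network request; the values are made up.
        self.__dict__.update({"title": "example title", "score": 1})
        self._fetched = True

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name.startswith("_") or self._fetched:
            raise AttributeError(name)
        self._fetch()
        return getattr(self, name)


thing = LazyThing("39zje0")
attrs_before = sorted(vars(thing))
thing.title  # accessing a missing attribute triggers the simulated fetch
attrs_after = sorted(vars(thing))
print(attrs_before)  # ['_fetched', 'id']
print(attrs_after)   # ['_fetched', 'id', 'score', 'title']
```

Before the attribute access, vars() reports only what was set at construction time; afterwards it also reports the fetched values.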