The Canvas API and Assigned Discussions

The university where I work uses Canvas, a Learning Management System. It does a lot, but there's almost always something you'd like it to do that it can't. Luckily, Canvas comes with a RESTful API.

In some circumstances, I like to have students read an assigned piece before class so that we can discuss it during our meeting. I ask students to post a question or two based on their reading to a discussion board, which is also set up as an assignment. Doing this well and on time is a significant portion of the participation grade in some classes. But Canvas makes it hard to see at a glance which assignments in a course are discussions, and it isn't straightforward to find out how often a student has (or hasn't) been posting. It can also be useful to know which posts were submitted after the due date, which is just as tedious to work out for a large class.

I've solved this problem, for my own purposes at least, with a Python script that consumes data from the Canvas API. Here's an annotated snippet including the (dirty) logic for seeing which students have posted for each discussion and how many of their posts were submitted late:

import json

import click
# Assumed import: dtparse is taken to be dateutil's parser.
from dateutil.parser import parse as dtparse

# Decorator for adding this function to a click (cli module) command-group,
#  which means this function becomes a subcommand,
#  e.g. mycommand mysubcommand arg1 arg2 ...
#  (The group function isn't shown here; the name "cli" is assumed.)
@cli.command()
# Tells click to inject the group context into the function's
#  arguments (as variable "ctx", below).
@click.pass_context
# Require one argument, the course id, and inject it as argument "courseid".
@click.argument('courseid')
# Accept two options, one for the proportional value of late
#  posts and another for the maximum points.
#  Both are injected into the function as args.
@click.option('--latecredit', default=.5, type=float)
@click.option('--maxpoints', default=50, type=float)
def participation(ctx, courseid, latecredit, maxpoints):
    # The command group function runs first and sets up the 
    #  Canvas API token from an environment variable.
    #  For convenience, this is stashed in a local variable.
    token = ctx.obj['TOKEN']
    # Call the Canvas API through a wrapper function (not included here),
    #  using the above auth token and a path, e.g. courses/1234
    course = api(token, 'courses/{}'.format(courseid))
    # Seems to come down as a list, at least some of the time, so check for that.
    if type(course) is list:
        course = course[0]
    # Get enrollment records (all course-user objects).
    enrollments = course['enrollments']
    # We could use a query string parameter above to filter these, but this is fine.
    #  Use above to see if owner of the token is a teacher in this course.
    if not any(map(lambda en: en['type'] == 'teacher', enrollments)):
        click.secho('you are not enrolled as a teacher. abort.\n', fg='red')
        return 1
    students = []
    # Fetch paginated course-users data, limited to students.
    pages = api(token, 'courses/{}/users'.format(courseid), enrollment_type='student')
    # Build a list of students with just the following: user id, name, and a
    #  sortable name.
    for users in pages:
        for user in users:
            students.append(tuple((user[k] for k in ('id', 'name', 'sortable_name'))))
    disc_assigns = []
    # Fetch assignments.
    pages = api(token, 'courses/{}/assignments'.format(courseid))
    due_dates = {}
    for assignments in pages:
        for assignment in assignments:
            # We're interested in assignments associated with a discussion topic.
            #  And we only want published assignments...and it seems the published parameter
            #  doesn't work on the api endpoint above.
            if assignment.get('published') and 'discussion_topic' in assignment:
                # Grab the topic id and store it. Also grab the due date and parse it
                #  for comparison with submission dates later on.
                topicid = assignment['discussion_topic']['id']
                disc_assigns.append(topicid)
                due_dates[topicid] = dtparse(assignment['due_at'])
    # Create (per student, keyed w/ student id) dictionaries for student 
    #  participation (per topic) and late submissions.
    students_part = {sid:[] for sid, _, _ in students}
    students_late = {sid:[] for sid, _, _ in students}
    for topicid in disc_assigns:
        # Fetch the discussion topic "view" object, which brings a load of
        #  data, including the participants and their entries.
        disc_view = api(
            token,
            'courses/{}/discussion_topics/{}/view'.format(courseid, topicid)
        )
        # Comes down as a list, again, at least some of the time. Deal with it.
        if type(disc_view) is list:
            disc_view = disc_view[0]
        for participant in disc_view['participants']:
            sid = participant['id']
            # Check each participant to see if they're in the roster, then
            #  add the topic id to their list of participated-in discussions.
            #  It's possible to have participants who are not students or who
            #  are no longer enrolled/active in the course, so we want to limit
            #  this to the roster we built earlier.
            if sid in students_part:
                students_part[sid].append(topicid)
            else:
                debug('skipping discussion participant:\n'+json.dumps(participant))
        due_date = due_dates[topicid]
        for entry in disc_view['view']:
            # Parse the entry's submission date/time and compare it to the due date.
            #  If it's late, add the topic id to the student's late list.
            created = dtparse(entry['created_at'])
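            if 'user_id' not in entry:
                # No author on this entry, so there's nobody to mark late.
                continue
            author = entry['user_id']
            # Only track authors who are on the roster built earlier.
            if created > due_date and author in students_late:
                students_late[author].append(topicid)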
    # Make a nice dict from the original student data (a list).
    roster = {sid:tuple((name, sname)) for sid, name, sname in students}
    # For convenience, store the total number of assignments
    tot = float(len(disc_assigns))
    # Sort the list of students on the sortable name and keep their ids in said order.
    ordered = tuple(sid for _, sid in sorted((sname, sid) for sid, _, sname in students))
    # Print the goods.
    for sid in ordered:
        part = students_part[sid]
        late = students_late[sid]
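        # NOTE: the scoring expression below is an assumption (the original
        #  line is missing): on-time posts earn full credit, late posts
        #  earn the latecredit fraction.
        points = maxpoints * ((len(part) - len(late)) + latecredit * len(late)) / tot
        click.echo(
            '{}\t{}\t{}\t{}\t{:.2%}\t{:.2%}\t{:.2f}'.format(
                sid, roster[sid][0], 
                len(part), len(late), 
                len(part)/tot, len(late)/float(len(part)) if part else 0.0,
                points
            )
        )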
    return 0
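
The api wrapper isn't included above. As a rough sketch only, something along these lines would fit the way the script calls it: Canvas expects the token in an Authorization: Bearer header and paginates list endpoints through a Link response header with a rel="next" URL. The names BASE_URL, next_link, http_get and the fetch parameter are all made up for illustration, and the host is a placeholder. Returning a list of pages is also an assumption; it is one plausible reason a single course "comes down as a list".

```python
import json
import re
import urllib.parse
import urllib.request

# Assumption: placeholder host; substitute your institution's Canvas domain.
BASE_URL = 'https://canvas.example.edu/api/v1/'


def next_link(link_header):
    """Return the rel="next" URL from a Link header, or None if absent."""
    pairs = re.findall(r'<([^>]+)>\s*;\s*rel="([^"]+)"', link_header or '')
    for url, rel in pairs:
        if rel == 'next':
            return url
    return None


def http_get(url, token):
    """Fetch one URL; return (decoded JSON body, Link header or None)."""
    request = urllib.request.Request(
        url, headers={'Authorization': 'Bearer {}'.format(token)})
    with urllib.request.urlopen(request) as response:
        return json.load(response), response.headers.get('Link')


def api(token, path, fetch=http_get, **params):
    """Return a list of pages, one decoded JSON body per response."""
    url = urllib.parse.urljoin(BASE_URL, path)
    if params:
        url += '?' + urllib.parse.urlencode(params)
    pages = []
    while url:
        body, link = fetch(url, token)
        pages.append(body)
        # Follow rel="next" until the server stops offering one.
        url = next_link(link)
    return pages
```

The fetch parameter exists only so the paging loop can be exercised without a network connection; a real script would likely just call http_get directly.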

I can then run the script and pipe the output through column to make it look nice (and, as in the example below, use grep to select only the line for a particular student):

python -m cnvv participation 12345 | column -ts $'\t' | grep 'Alice'

You get output like this:

12340000000012345  Alice Jones    50  0   100.00%  0.00%   50.00
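
The columns are the user id, name, number of discussions participated in, number of late posts, share of discussions participated in, share of late posts, and points. The arithmetic behind the last three can be sketched as below. The points formula is an assumption about how --latecredit and --maxpoints combine (full credit for on-time posts, partial credit for late ones), and participation_row is a hypothetical helper, not part of the script.

```python
def participation_row(n_part, n_late, n_topics, latecredit=0.5, maxpoints=50.0):
    """Return (participation share, late share, points) for one student.

    Assumed scoring: each on-time post counts fully and each late post
    counts as `latecredit` of a post, scaled to `maxpoints`.
    """
    if not n_topics:
        return 0.0, 0.0, 0.0
    part_share = n_part / n_topics
    # Avoid dividing by zero for students who never posted.
    late_share = n_late / n_part if n_part else 0.0
    on_time = n_part - n_late
    points = maxpoints * (on_time + latecredit * n_late) / n_topics
    return part_share, late_share, points


# Alice: 50 discussions, posted in all of them, none late.
print('{:.2%}\t{:.2%}\t{:.2f}'.format(*participation_row(50, 0, 50)))
# prints: 100.00%	0.00%	50.00
```

Under the same assumption, a student with 40 posts out of 50, 10 of them late, would land at 35 points.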

Sure beats doing it by hand.