Iterator exercises – Iterables and iterators tutorial

Iterable or not?

For each object below, answer:

Is it an iterable, an iterator, both, or neither? Can it be used in a for loop? Will it work with next?
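One way to check your answers is with the abstract base classes in collections.abc. The sketch below is only an illustration: the classify helper and the example objects are assumptions, not the objects from this exercise. Note that isinstance with Iterable does not detect objects that are iterable only through __getitem__.

```python
from collections.abc import Iterable, Iterator


def classify(obj):
    """Return a short label describing how obj relates to iteration."""
    # Every iterator is also an iterable, so test for Iterator first.
    if isinstance(obj, Iterator):
        return "iterator (and iterable)"
    if isinstance(obj, Iterable):
        return "iterable only"
    return "neither"


# Hypothetical example objects, chosen only for illustration.
examples = {
    "list": [1, 2, 3],
    "iter(list)": iter([1, 2, 3]),
    "str": "ACGT",
    "int": 42,
    "generator expression": (x for x in range(3)),
}

for name, obj in examples.items():
    print(name, "->", classify(obj))
```

Any iterable can be used in a for loop, but next only works on iterators; calling next on a plain list raises a TypeError.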

Reusable or not?

Before running the code, predict the output and write down what you think will be printed.

Why do these two pieces of code behave differently?

What will this code print and why?

Even numbers

Write a generator function called even_numbers(n) that yields even numbers from 0 up to n (inclusive).
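One possible solution, given as a minimal sketch: it relies on the step argument of range to skip the odd numbers.

```python
def even_numbers(n):
    """Yield the even numbers from 0 up to and including n."""
    # range stops before its second argument, so use n + 1 to include n.
    for number in range(0, n + 1, 2):
        yield number


print(list(even_numbers(10)))  # [0, 2, 4, 6, 8, 10]
```

If n is odd, the last value yielded is n - 1, and a negative n yields nothing.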

The Fibonacci generator

Create an infinite Fibonacci sequence generator.
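A minimal sketch of one possible solution, assuming the sequence starts at 0. Because the generator never ends, the example uses itertools.islice to take only the first ten values.

```python
from itertools import islice


def fibonacci():
    """Yield the Fibonacci numbers forever, starting with 0 and 1."""
    a, b = 0, 1
    while True:
        yield a
        # Advance the pair: the next value is the sum of the last two.
        a, b = b, a + b


print(list(islice(fibonacci(), 10)))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Never call list directly on an infinite generator; always limit it with islice or a break inside the loop.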

DNA sequence generator

Create a generator of random DNA sequences.
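The exercise does not fix the details, so the sketch below makes some assumptions: the function name random_dna_sequences, a fixed length per sequence, and an optional seed for reproducible output are all illustrative choices.

```python
import random
from itertools import islice


def random_dna_sequences(length, seed=None):
    """Yield random DNA strings of the given length, forever."""
    # A private Random instance keeps the global random state untouched.
    rng = random.Random(seed)
    while True:
        yield "".join(rng.choices("ACGT", k=length))


# Take only three sequences from the infinite generator.
for sequence in islice(random_dna_sequences(10, seed=42), 3):
    print(sequence)
```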

List vs generator

Rewrite this code using a generator expression:

Use zip to pair lists

Use the zip function to pair the values in two lists, then use a for loop to print the name of each sample along with its gc_content.

Using zip and dict, create a dictionary from the same two lists, with the sample names as keys.
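Both tasks can be sketched as follows. The two lists here are made-up placeholder data, not the lists from the exercise.

```python
# Hypothetical input lists, used only for illustration.
sample_names = ["sample_a", "sample_b", "sample_c"]
gc_contents = [0.41, 0.52, 0.38]

# zip yields one (name, gc_content) tuple per position.
for name, gc_content in zip(sample_names, gc_contents):
    print(f"{name}: {gc_content}")

# dict accepts any iterable of (key, value) pairs, so zip fits directly.
gc_by_sample = dict(zip(sample_names, gc_contents))
print(gc_by_sample)
```

zip stops at the end of the shorter list, so a length mismatch silently drops the extra items.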

Divide the VCF parser

We are parsing a VCF file, and for each variant we want a numeric index. This is our first attempt:

This code works, but it has two issues.

It creates a list in memory with all the variants.
The read_and_index_variants function does two tasks: parsing the variants and indexing them.

It would be better to create a generator that only parses the VCF file and to index the variants outside that generator. Using a generator also avoids holding all the variants in memory at once.
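A minimal sketch of that split, under stated assumptions: the name parse_variants, the dictionary layout, and the tiny in-memory VCF text are illustrative and not taken from the original code. The parser only parses; the indexing is done outside with the built-in enumerate.

```python
import io


def parse_variants(lines):
    """Yield one dict per variant line, skipping header and blank lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        # Standard VCF columns: CHROM, POS, ID, REF, ALT, ...
        fields = line.rstrip("\n").split("\t")
        yield {
            "chrom": fields[0],
            "pos": int(fields[1]),
            "ref": fields[3],
            "alt": fields[4],
        }


# Hypothetical VCF content; a real script would pass an open file instead.
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t250\t.\tC\tT\t60\tPASS\t.\n"
    "chr2\t75\t.\tG\tA\t40\tPASS\t.\n"
)

# Indexing happens here, lazily, one variant at a time.
for index, variant in enumerate(parse_variants(io.StringIO(vcf_text))):
    print(index, variant["chrom"], variant["pos"])
```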

Count the number of variants per chromosome

Use the parser created in the last exercise to count the number of variants per chromosome.
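One way to do the counting is to feed a generator expression to collections.Counter. To keep the sketch self-contained it defines its own small parser; parse_variants and the sample VCF text are hypothetical stand-ins for your parser and your file.

```python
import io
from collections import Counter


def parse_variants(lines):
    """Yield a dict with chromosome and position for each variant line."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos = line.split("\t")[:2]
        yield {"chrom": chrom, "pos": int(pos)}


# Hypothetical VCF content, used only for illustration.
vcf_text = (
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t250\t.\tC\tT\t60\tPASS\t.\n"
    "chr2\t75\t.\tG\tA\t40\tPASS\t.\n"
)

# Counter consumes the generator one item at a time; no list is built.
variants = parse_variants(io.StringIO(vcf_text))
counts = Counter(variant["chrom"] for variant in variants)

for chrom, count in counts.items():
    print(chrom, count)
```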

Fix the bug

The following code is supposed to compute the mean of some values. What is wrong with this code? Fix it without changing the input data.

Write a FASTA file parser that yields one sequence at a time
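A minimal sketch of such a parser, assuming each record starts with a ">" header line and that the sequence may span several lines. The name parse_fasta and the choice to yield (name, sequence) tuples are illustrative assumptions.

```python
import io


def parse_fasta(lines):
    """Yield (name, sequence) tuples, one record at a time."""
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:]
            chunks = []
        elif name is not None:
            chunks.append(line)
    # Emit the last record, which has no header after it.
    if name is not None:
        yield name, "".join(chunks)


# Hypothetical FASTA content; a real script would pass an open file.
fasta_text = ">seq1\nATGC\nGGTA\n>seq2\nTTAA\n"
for name, sequence in parse_fasta(io.StringIO(fasta_text)):
    print(name, sequence)
```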

Filter short sequences and calculate GC content

Filter the sequences generated by the FASTA parser: discard those shorter than a given length threshold, then calculate the mean GC content of the remaining ones.
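The sketch below shows one way to do this in a single pass. It assumes the parser yields (name, sequence) tuples; the function names, the min_length parameter, and the example records are hypothetical.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in sequence (0.0 if empty)."""
    if not sequence:
        return 0.0
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def mean_gc_of_long_sequences(records, min_length):
    """Return the mean GC content of sequences with at least min_length bases.

    Returns None when no sequence passes the filter.
    """
    total = 0.0
    count = 0
    # Keep a running sum and count, so records can be a one-shot generator.
    for _name, sequence in records:
        if len(sequence) < min_length:
            continue
        total += gc_content(sequence)
        count += 1
    return total / count if count else None


# Hypothetical records standing in for the output of a FASTA parser.
records = [("s1", "GGCC"), ("s2", "AT"), ("s3", "ATGC")]
print(mean_gc_of_long_sequences(iter(records), min_length=4))  # 0.75
```

Keeping a running total and count avoids calling both sum and len on the same generator, which would fail because a generator can only be consumed once.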

Sliding Window (The k-mer Generator)

In bioinformatics, we often analyze “k-mers” (subsequences of length k). Write a generator function generate_kmers(sequence, k) that takes a DNA string and an integer k and yields every possible k-mer as you slide along the sequence. Use it to analyze a whole FASTA file and print the final k-mer count.
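A minimal sketch of the sliding window, followed by a count over several sequences. The list of sequences is a hypothetical stand-in for the sequences read from a FASTA file, and raising ValueError for a non-positive k is a design choice, not a requirement of the exercise.

```python
from collections import Counter


def generate_kmers(sequence, k):
    """Yield every k-mer of sequence, sliding one base at a time."""
    # In a generator this check runs when the first value is requested.
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length n has n - k + 1 k-mers (none if n < k).
    for start in range(len(sequence) - k + 1):
        yield sequence[start:start + k]


print(list(generate_kmers("ATGCA", 3)))  # ['ATG', 'TGC', 'GCA']

# Hypothetical sequences standing in for the records of a FASTA file.
sequences = ["ATGATG", "TGA"]
kmer_counts = Counter()
for sequence in sequences:
    # update consumes the generator and adds one count per k-mer.
    kmer_counts.update(generate_kmers(sequence, 3))

print(kmer_counts)
```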