Commit 8aaaa572 authored by christian.foerster

still work in progress but getting there

parent cbef1e07
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# %line_magic"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# %%cell_magic (must be in the first line of the cell)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# list the help for magic commands\n",
"%magic"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# list content of current folder\n",
"%ls"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%lsmagic"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%file test_simple.py\n",
"# writing content of cell to file\n",
"\n",
"def test_one():\n",
" assert 1 + 1 == 2\n",
" \n",
"def test_two():\n",
" \"\"\" this test will fail !\"\"\"\n",
" a = [1, 2, 3]\n",
" b = [3, 4]\n",
" assert a + b == [1, 2, 3, 4]\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get current working directory\n",
"%pwd\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# return the history of a notebook\n",
"%hist"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# pip install in notebook\n",
"%pip install numpy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# run a python script from inside a notebook\n",
"%run test_simple.py"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# run shell commands and capture output!\n",
"%sx "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# time your code\n",
"%time some_list=[a+b**2 for a,b in enumerate(range(1000000))]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# time your code - more sophisticated (repeats the statement and reports statistics)\n",
"%timeit some_list=[a+b**2 for a,b in enumerate(range(1000000))]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"a=\"22\"\n",
"b=33\n",
"\n",
"# quickly list variables of certain types\n",
"%who function str"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"There's a lot more to discover: convert cells to HTML, run JavaScript in cells, render as SVG, ..."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
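The `%time` and `%timeit` magics in the notebook above only work inside IPython/Jupyter. As a rough sketch of what `%timeit` does, the standard-library `timeit` module can repeat a statement several times and report the best run. The helper names `build_list` and `best_time` and the reduced list size are assumptions made for illustration, not part of the notebook.

```python
import timeit


def build_list(n=10_000):
    # Same expression as in the notebook cell, with a smaller n (assumption) to keep it quick
    return [a + b**2 for a, b in enumerate(range(n))]


def best_time(func, repeat=3, number=10):
    """Return the best per-call time in seconds over `repeat` rounds of `number` calls."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    print(f"best of 3: {best_time(build_list) * 1e3:.3f} ms per call")
```

Taking the minimum rather than the mean reduces the influence of other processes that happen to run during the measurement.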
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NumPy\n",
"\n",
"If you love MATLAB, NumPy will feel very familiar. (https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html)\n",
"\n",
"Why use NumPy? Because it's very fast: array operations run in compiled code instead of Python loops. (https://stackoverflow.com/questions/7596612/benchmarking-python-vs-c-using-blas-and-numpy)\n",
"\n",
"**A more comprehensive tutorial:** (https://docs.scipy.org/doc/numpy/user/quickstart.html)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Array Examples"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"arr = np.array([3,4,5,6])\n",
"rng = np.arange(100)\n",
"ones = np.ones(1000000)\n",
"zeros = np.zeros((100,100)) # 2d array shape passed as tuple\n",
"repeat = np.repeat(2,1000)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Shapes"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"repeat.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 3d array\n",
"repeat.reshape((10,10,10))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# flatten array to 1d\n",
"repeat.flatten()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Indexing and Slicing"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rng[:5]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rng[79]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rng[3::5]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rng_2d = rng.reshape((10,10))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# all rows, starting with the fifth column (index 4)\n",
"rng_2d[:,4:] "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# first 5 rows, all columns\n",
"rng_2d[:5,:] "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rng_2d[3:5,6:9]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Boolean Indexing"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 2d boolean array\n",
"boolean_arr = rng_2d > 45\n",
"boolean_arr"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# boolean indexing returns a flat 1d array, so the 2d shape is lost\n",
"rng_2d[boolean_arr]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rng_2d[rng_2d[:,0] > 40, 0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rng = np.arange(10000000)\n",
"# element-wise operators: & (and), | (or), ~ (not)\n",
"%timeit rng[(rng<90) & (rng>67)]\n",
"%timeit rng[np.logical_and(rng<90,rng>67)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Performance Comparison"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"vector_len = 10000000"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"square = []\n",
"for i in range(vector_len):\n",
" square.append(i**2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"square_np = np.square(np.arange(vector_len))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"square_even = []\n",
"for i in range(vector_len):\n",
"    i_2 = i**2\n",
"    if i_2 < (vector_len / 2) ** 2:\n",
"        square_even.append(i_2)\n",
"    else:\n",
"        pass"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"square_even_np = np.square(np.arange(vector_len))\n",
"square_even_np = square_even_np[square_even_np < (vector_len / 2) ** 2]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
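The boolean indexing and performance cells in the notebook above rest on one idea: a comparison on an array yields a boolean mask, and indexing with that mask replaces the explicit Python loop. The following is a small sketch that checks both approaches give the same result. The function names and the reduced vector length are assumptions for illustration.

```python
import numpy as np


def squares_below_loop(vector_len):
    # Pure-Python version: square each value and keep those below the limit
    limit = (vector_len / 2) ** 2
    return [i**2 for i in range(vector_len) if i**2 < limit]


def squares_below_numpy(vector_len):
    # Vectorized version: square everything at once, then filter with a boolean mask
    squares = np.square(np.arange(vector_len, dtype=np.int64))
    mask = squares < (vector_len / 2) ** 2
    return squares[mask]


if __name__ == "__main__":
    n = 1000  # much smaller than the notebook's 10000000, just for a quick check
    print(np.array_equal(squares_below_numpy(n), squares_below_loop(n)))
```

The explicit `dtype=np.int64` is a precaution (an assumption about the platform): on systems where the default integer is 32 bits wide, squaring large values could otherwise overflow.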
......@@ -3,8 +3,8 @@
The tutorial consists of multiple parts.
- Python Basics
- Python Advanced
- Pandas and Plotting
- Python Pandas, Numpy, Plotting, Statsmodels
- Packaging
## Requirements
......@@ -79,7 +79,7 @@ We'll be using:
To install the packages open a console and run the following command.
(Linux users should use pip3 instead of pip if both Python 2 and Python 3 are installed.)
```bash
pip install pandas numpy matplotlib plotly cufflinks seaborn sklearn jupyter notebook
pip install pandas numpy matplotlib cufflinks seaborn sklearn jupyter notebook
```
There are many dependencies involved, so the installation might take a minute.
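
To check afterwards that the installation worked, a small sketch like the following can help. The helper name `missing_packages` is hypothetical, and the list only covers packages whose import name is assumed to match the name used in the pip command.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for a subset of the packages installed above
    wanted = ["pandas", "numpy", "matplotlib", "seaborn", "sklearn"]
    missing = missing_packages(wanted)
    print("all packages found" if not missing else f"missing: {missing}")
```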
......
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**1. Create a function that prints a list without commas or brackets.**\n",
"\n",
"Input: [1,2,3,4] \n",
"Output: 1 2 3 4 "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**2. Create a function that negates all values of a boolean list.**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**3. Choose a name randomly. (Use the random module)** \n",
"Namen=[\"Karl\",\"Lisa\",\"Sven\",\"Birgit\",\"Igor\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**4. Create a prime number generator!**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**5. Define your own sorting algorithm!** \n",
"Input is a list!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**6. Define your own algorithm to get the max value of a list!**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**7. Write a class called 'FileAnalyser'.** \n",
"\n",
"When instantiated, it receives the path of the file you want to analyse. \n",
"The class should have 2 attributes, 'path' and 'name'. \n",
"Write at least 2 methods for the class that return useful stats about the file, for example: \n",
"\n",
" - size (with the os package)\n",
" - number of lines\n",
" - total words (string patterns separated by whitespace or newlines)\n",
" - contains_numbers \n",
" - contains_uppercase\n",
" - contains_lowercase\n",
" - ...\n",
" \n",
"(Implementing all of the above would give at least 6 methods.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
line1
line 2
line 3
some other info
and so
on
and
so on
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Solution to Basic"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**1. Create a function that prints a list without commas or brackets.**\n",
"\n",
"Input: [1,2,3,4] \n",
"Output: 1 2 3 4 "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def list_formatter(lst):\n",
" return \" \".join([str(i) for i in lst])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**2. Create a function that negates all values of a boolean list.**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def negate(lst):\n",
" return [not val for val in lst]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**3. Choose a name randomly. (Use the random module)** \n",
"Namen=[\"Karl\",\"Lisa\",\"Sven\",\"Birgit\",\"Igor\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"lst = [\"Karl\",\"Lisa\",\"Sven\",\"Birgit\",\"Igor\"]\n",
"random.choice(lst)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**4. Create a prime number generator!**\n",
"\n",
"That's a pretty slow version: every candidate is tested against all smaller numbers."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def prime_gen(limit):\n",
"    for val in range(limit + 1):\n",
"        if val > 1:\n",
"            prime = True\n",
"            for n in range(2, val):\n",
"                if (val % n) == 0:\n",
"                    prime = False\n",
"                    break\n",
"            if prime:\n",
"                yield val\n"