MY PROJECT -*- mode: org -*-
1. ML/DL
1.1. Interesting transformers
Sentence-T5 https://arxiv.org/pdf/2108.08877.pdf
1.2. How to run GPT etc.
from transformers import GPT2LMHeadModel
from transformers import GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("gpt2").cuda()
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

prompt = "The difference between baroque and renaissance art according to art historians is"
input_ids = tokenizer.encode(prompt, return_tensors='pt').to(model.device)

generated_text_samples_ids = model.generate(
    input_ids,
    max_length=64,
    num_return_sequences=5,
    num_beams=10,
    do_sample=True,
    no_repeat_ngram_size=2,
)
generated_text_samples = list(tokenizer.batch_decode(generated_text_samples_ids))

for generated_text in generated_text_samples:
    print("GENERATED TEXT:")
    print(generated_text.replace("\n", " "))
    print("")
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that it is a form of art that has been around for a long time. Baroques and Renaissance art are not the same thing. They are two different things, both of which have their roots in the late 19th century. Bar

GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that the Renaissance was a period in which the art of painting, sculpture, and sculpture were becoming more and more popular. In the 19th and early 20th centuries, there were two major trends in art: the rise and fall of the

GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that the Renaissance was a time when the art of painting was becoming more and more popular. In the 19th century, there were two main styles of art: painting and sculpture. Painting was the most popular form of artistic expression, while sculpture

GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that Renaissance art is more of a work of art than it is of architecture. In the 17th century, the city of Barcelona was one of the most important centers of artistic expression in the world. It was home to many famous artists,

GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that in the early 19th century, the art of Renaissance art was in decline. In the 20th and 21st centuries, however, it was gaining strength. In the 1920s and 1930s, there was a revival of art in
GPT Neo
from transformers import GPTNeoForCausalLM, GPT2Tokenizer

model_name = "EleutherAI/gpt-neo-2.7B"
model = GPTNeoForCausalLM.from_pretrained(model_name).cuda()
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

prompt = "The difference between baroque and renaissance art according to art historians is"
input_ids = tokenizer.encode(prompt, return_tensors='pt').to(model.device)

generated_text_samples_ids = model.generate(
    input_ids,
    max_length=64,
    num_return_sequences=5,
    num_beams=10,
    do_sample=True,
    no_repeat_ngram_size=2,
)
generated_text_samples = list(tokenizer.batch_decode(generated_text_samples_ids))

for generated_text in generated_text_samples:
    print("GENERATED TEXT:")
    print(generated_text.replace("\n", " "))
    print("")
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that the former is a style of art that emerged in the sixteenth and seventeenth centuries, while the latter is an art style that has been around since the time of the ancient Greeks and Romans. The Renaissance is generally considered to have begun

GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that the former is a product of the Enlightenment, while the latter is the result of a counter-Enlightenment. In other words, Renaissance art was a reaction against the excesses of modernity, and it sought to return to the

GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that the former is the art of the people, while the latter is art for the elite. This is not to say that Renaissance art was not beautiful, but rather that it was intended to be beautiful for a select few. For example,

GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that the former is a product of the Middle Ages, while the latter is based on the Renaissance. The Renaissance was a period in the history of Western art that lasted from the 15th to the 17th century. It was marked by the

GENERATED TEXT:
The difference between baroque and renaissance art according to art historians is that the former is a product of the Middle Ages, while the latter is an outgrowth of modernity. In other words, there is no such thing as a “rebirth of art” in the sense of a return to
1.3. Vision + text models
Simple interfaces
1.3.1. CLIP - huggingface + sentence transformers
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("clip-ViT-B-16")
# model.encode works just like for texts but with PIL images
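A rough sketch of how the image and text embeddings might be compared, assuming the usual cosine-similarity scoring. The names `cosine_similarity`, `rank_captions`, `image_path` and `captions` are made up for illustration; the model download and the PIL/sentence-transformers imports only happen when `rank_captions` is actually called.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_captions(image_path, captions, model_name="clip-ViT-B-16"):
    """Score each caption against one image, best match first.

    Hypothetical helper: image_path and captions are supplied by the caller.
    """
    # Heavy imports stay local so cosine_similarity works without them.
    from PIL import Image
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    image_emb = model.encode(Image.open(image_path))  # one vector
    caption_embs = model.encode(captions)             # one vector per caption
    scored = [
        (float(cosine_similarity(image_emb, emb)), caption)
        for emb, caption in zip(caption_embs, captions)
    ]
    return sorted(scored, reverse=True)
```

Usage would look like `rank_captions("some_image.jpg", ["a dog", "a cat"])`, where the file name is a placeholder.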
2. IR
import pytrec_eval
import json

qrel = {
    'q1': {
        'd1': 0,
        'd2': 1,
        'd3': 0,
    },
    'q2': {
        'd2': 1,
        'd3': 1,
    },
}

run = {
    'q1': {
        'd1': 1.0,
        'd2': 0.0,
        'd3': 1.5,
    },
    'q2': {
        'd1': 1.5,
        'd2': 0.2,
        'd3': 0.5,
    }
}

evaluator = pytrec_eval.RelevanceEvaluator(
    qrel, {'map', 'ndcg'})

print(json.dumps(evaluator.evaluate(run), indent=1))
{
 "q1": {
  "map": 0.3333333333333333,
  "ndcg": 0.5
 },
 "q2": {
  "map": 0.5833333333333333,
  "ndcg": 0.6934264036172708
 }
}
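To see where these numbers come from, here is a hand-rolled sketch in plain Python. It assumes average precision over the judged-relevant documents and NDCG with linear gain and a log2(rank + 1) discount, which reproduces the values above for this toy data. It is not pytrec_eval's implementation, and it ignores tie-breaking since the example has no tied scores. The helper names are made up.

```python
import math

qrel = {
    "q1": {"d1": 0, "d2": 1, "d3": 0},
    "q2": {"d2": 1, "d3": 1},
}
run = {
    "q1": {"d1": 1.0, "d2": 0.0, "d3": 1.5},
    "q2": {"d1": 1.5, "d2": 0.2, "d3": 0.5},
}


def ranked_docs(scores):
    # Highest score first; ties are not handled specially in this sketch.
    return sorted(scores, key=scores.get, reverse=True)


def average_precision(judgments, scores):
    num_relevant = sum(1 for rel in judgments.values() if rel > 0)
    if num_relevant == 0:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked_docs(scores), start=1):
        if judgments.get(doc, 0) > 0:  # unjudged docs count as non-relevant
            hits += 1
            total += hits / rank       # precision at this rank
    return total / num_relevant


def dcg(gains):
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(judgments, scores):
    gains = [judgments.get(doc, 0) for doc in ranked_docs(scores)]
    ideal_dcg = dcg(sorted(judgments.values(), reverse=True))
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


results = {
    qid: {
        "map": average_precision(qrel[qid], run[qid]),
        "ndcg": ndcg(qrel[qid], run[qid]),
    }
    for qid in qrel
}
print(results)
```

For q1 the only relevant document (d2) lands at rank 3, so AP is 1/3 and NDCG is (1/log2 4) / (1/log2 2) = 0.5.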
3. Tooling
3.1. Ploomber
3.1.1. Resources
3.1.2. Basics
Config in pipeline.yaml
tasks/task1.py contains functions
def subtask1(product, upstream):
    ...
    out.to_csv(str(product))
tasks:
  - source: tasks.task1.subtask1
    product:  # either a single string (the output path) or a mapping of named outputs, as here
      output: task1_out.csv
      nb: products/report1.ipynb  # the executed notebook can also be saved as an output
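A minimal sketch of what such a task function could look like, for the single-product case where `product` is just the output path. The rows are made-up data, the standard `csv` module stands in for whatever produced `out` in the original, and the function is called directly here rather than through Ploomber.

```python
import csv
import tempfile
from pathlib import Path


def subtask1(product, upstream=None):
    """Write a small CSV to the path given by `product`."""
    rows = [{"id": 1, "value": 0.5}, {"id": 2, "value": 1.5}]  # placeholder data
    with open(str(product), "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "value"])
        writer.writeheader()
        writer.writerows(rows)


# Direct call outside Ploomber, writing into a temporary directory.
out_path = Path(tempfile.mkdtemp()) / "task1_out.csv"
subtask1(product=out_path)
lines = out_path.read_text().splitlines()
print(lines)
```

With a mapping of products as in the YAML above, the path would presumably be looked up as `product["output"]` instead of `str(product)`.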
3.1.3. Hyperparameters
tasks:
  - source: tasks.model.fit
    product: ...
    grid:
      - model:
          sklearn.ensemble.RandomForestClassifier
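As far as I understand it, a grid produces one task per combination of the listed parameter values. A sketch of that expansion with `itertools.product`; this is only an illustration of the idea, not Ploomber's own code, and `n_estimators` / `max_depth` with their values are made-up parameters.

```python
from itertools import product as cartesian_product


def expand_grid(grid):
    """Return one parameter dict per combination of the grid's values."""
    names = list(grid)
    combos = cartesian_product(*(grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


grid = {
    "model": ["sklearn.ensemble.RandomForestClassifier"],
    "n_estimators": [10, 100],  # hypothetical values
    "max_depth": [3, 5],        # hypothetical values
}

param_sets = expand_grid(grid)
for params in param_sets:
    print(params)
```

Here 1 x 2 x 2 values give four parameter sets, so four generated tasks.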
3.1.4. Partial run
ploomber build --partially "<TASKNAME>" --skip-upstream
3.2. Elasticsearch
#+BEGIN_SRC python :results output :exports both
import elasticsearch

client = elasticsearch.Elasticsearch()
client.indices.get_alias()
#+END_SRC