August 2, 2025
NLP transfer learning
Word Embeddings in NLP: An Introduction
https://hunterheidenreich.com/posts/intro-to-word-embeddings/
Distributional semantics
https://en.wikipedia.org/wiki/Distributional_semantics
distributional hypothesis: linguistic items with similar distributions have similar meanings.
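A tiny sketch of the hypothesis in code (the toy corpus, window size, and word choices are made up for illustration): words that occur in similar contexts end up with similar context-count vectors.

from collections import Counter
from math import sqrt

# Toy corpus; sentences are already tokenized (made up for illustration).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "the cat chased the mouse".split(),
    "the dog chased the ball".split(),
]

def context_counts(target, window=2):
    """Count words appearing within `window` positions of `target`."""
    counts = Counter()
    for sentence in corpus:
        for i, word in enumerate(sentence):
            if word != target:
                continue
            lo, hi = max(0, i - window), i + window + 1
            counts.update(w for w in sentence[lo:hi] if w != target)
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm

cat, dog, mouse = context_counts("cat"), context_counts("dog"), context_counts("mouse")
print(cosine(cat, dog))    # high: "cat" and "dog" occur in similar contexts
print(cosine(cat, mouse))  # lower: "mouse" appears in different contexts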
word2vec:
Tool for computing continuous distributed representations of words.
https://code.google.com/archive/p/word2vec/
https://www.tensorflow.org/text/tutorials/word2vec
https://en.wikipedia.org/wiki/Word2vec
Preservation of semantic and syntactic relationships:
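A sketch of what this looks like in practice. It uses gensim's downloader with small pretrained GloVe vectors as a stand-in for word2vec embeddings (the library and model name are assumptions, not part of the original tool): arithmetic on the vectors approximately recovers semantic and syntactic analogies.

import gensim.downloader as api

# Small pretrained GloVe vectors as a stand-in for word2vec embeddings.
vectors = api.load("glove-wiki-gigaword-50")

# Semantic analogy: king - man + woman ≈ queen
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# Syntactic analogy: walking - walk + swim ≈ swimming (approximately)
print(vectors.most_similar(positive=["walking", "swim"], negative=["walk"], topn=1))

# Related words get similar vectors.
print(vectors.similarity("good", "great"))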
Get training data by extracting text from Wikipedia
https://mattmahoney.net/dc/textdata.html
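A minimal sketch of this pipeline, assuming gensim is used in place of the original C word2vec tool: the text8 corpus (a cleaned Wikipedia extract described on the page above) is available through gensim's downloader, and a skip-gram model can be trained on it directly.

import gensim.downloader as api
from gensim.models import Word2Vec

# text8 is a cleaned Wikipedia extract (from Matt Mahoney's text data page),
# exposed by gensim's downloader as an iterable of tokenized "sentences".
corpus = api.load("text8")

# Train a skip-gram word2vec model (sg=1); hyperparameters are illustrative.
model = Word2Vec(
    sentences=corpus,
    vector_size=100,   # dimensionality of the word vectors
    window=5,          # context window size
    min_count=5,       # ignore rare words
    sg=1,              # 1 = skip-gram, 0 = CBOW
    workers=4,
)

# Each vocabulary word now has a dense 100-dimensional vector.
print(model.wv["computer"].shape)              # (100,)
print(model.wv.most_similar("computer", topn=3))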
Part 9.3: Transfer Learning for NLP with Keras
The code loads a pretrained embedding model into the variable model.
It then wraps this model in a Keras layer via hub.KerasLayer; the layer takes a raw text string
as input and outputs a 20-dimensional embedding vector.
import tensorflow as tf
import tensorflow_hub as hub

# `model` is the pretrained embedding model loaded earlier.
hub_layer = hub.KerasLayer(
    model,
    output_shape=[20],   # each input string maps to a 20-dimensional vector
    input_shape=[],      # scalar input: one string per example
    dtype=tf.string,
    trainable=True       # allow the embedding weights to be fine-tuned
)
The embedding layer converts each review into a single vector of 20 numbers.
For example:
print(hub_layer(train_examples[:1]))
prints
tf.Tensor(
[[
1.7657859 -3.882232 3.913424 -1.5557289 -3.3362343 -1.7357956
-1.9954445 1.298955 5.081597 -1.1041285 -2.0503852 -0.7267516
-0.6567596 0.24436145 -3.7208388 2.0954835 2.2969332 -2.0689783
-2.9489715 -1.1315986
]], shape=(1, 20), dtype=float32)
regardless of how many words the input has.
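To see where the transfer learning comes in, here is a minimal sketch of stacking a small classifier on top of the pretrained embedding layer, assuming train_examples and train_labels are the review texts and sentiment labels used above; the layer sizes and training settings are illustrative, not taken from the notes.

# Stack dense layers on top of the pretrained embedding layer.
classifier = tf.keras.Sequential([
    hub_layer,                                     # string -> 20-dim embedding
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1)                       # single logit: positive/negative
])

classifier.compile(
    optimizer="adam",
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

# Because trainable=True above, the pretrained embedding weights are fine-tuned
# together with the new dense layers.
classifier.fit(train_examples, train_labels, epochs=10, batch_size=512)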