Annotating multilingual sentences in „America Is Not The Heart“ by Elaine Castillo

Working with Google Colab to do the sentence annotations wasn’t so scary to me because I was already familiar with some coding/programming and linguistics from my bachelor’s degree. However, I hadn’t worked with the two in combination like this before, so that was a new and interesting experience for me. I had no real problems working with the program; some difficulties only occurred when trying to run multiple Colab notebooks at the same time (I had to save and close some before moving on to new ones!).

I chose a few sentences from the novel „America Is Not The Heart“ by Elaine Castillo: some are mostly English with a few Tagalog loanwords (all food-related items), some are Tagalog sentences with a few English words, and some are entirely in Tagalog. From my first observations I could tell that the program had no idea how to deal with the Tagalog words or sentences. On a positive note, the program at least annotated the English parts correctly, as far as I could see.
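For anyone curious what the annotation step roughly looks like, here is a minimal sketch. The post doesn’t name the library the Colab notebook uses, so this assumes spaCy’s English pipeline purely for illustration, and the example sentence is made up rather than quoted from the novel:

import spacy

# English-only model, so non-English words are effectively unknown to it
nlp = spacy.load("en_core_web_sm")
doc = nlp("She brought home some pandesal and a bowl of sinigang.")

# Print each token with its part-of-speech tag, dependency relation,
# and the head word it attaches to
for token in doc:
    print(f"{token.text:10} {token.pos_:6} {token.dep_:10} {token.head.text}")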

As a visualization, I copied the sentences into an Excel table together with the annotations and dependency relations that the program provided. The yellow highlighted bit is my attempt at giving a slightly more correct version of the word categories, with a rough translation. I did not attempt my own dependency relations for lack of grammatical knowledge, but I would guess that they should look very different from the current ones.
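The manual copying could in principle also be skipped by writing the table straight from the notebook. This is only a sketch, assuming the spaCy doc from the example above and pandas; the file name is arbitrary:

import pandas as pd

# Collect one row per token and save it as a table Excel can open directly
rows = [{"token": t.text, "pos": t.pos_, "dep": t.dep_, "head": t.head.text} for t in doc]
pd.DataFrame(rows).to_csv("annotations.csv", index=False)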

Either way, I found it very interesting to see how the program dealt with the sentences!


One response to Annotating multilingual sentences in „America Is Not The Heart“ by Elaine Castillo

  1. Anna Prickarz says:

    Dear Charmaine, reading your blog entry, I realized I forgot to comment on the machine’s ability to annotate the English parts in my own blog post, and I agree – the annotations and dependency relations for English seemed plausible to me as well. I find it quite interesting that for Tagalog the machine categorized everything as a proper noun, because this was not the case for all languages. For the Palestinian Arabic examples from the novel I chose, Susan Abulhawa’s “The Blue Between Sky and Water”, some words were labelled as proper nouns, but others as adjectives and adverbs – in all instances it was a miscategorization. As for the randomness of dependency relations, we seem to have had a similar experience: non-English words and phrases confused the machine and led to nonsensical interpretations of dependencies. Last but not least, I wish to point out that the idea of visualizing the results via Excel is great! I had wanted to include some screenshots of dependencies, but since Google Colab makes it difficult to see an entire sentence at once, I refrained from it. Visualization via Excel clearly solves that issue.
