Introduction to TEI and XML

In our second session (April 27th), we were introduced to TEI and XML. First, we learned that TEI stands for Text Encoding Initiative, a set of guidelines used to create, store and transform data in machine-readable formats. The type of information that can be described is varied: not only texts but also audio, pictures, and videos can be stored in digital form. XML stands for Extensible Markup Language, a descriptive markup language that uses tags to give a text document a clear structure. TEI and XML are essential tools because they make information, in our case Konkomba folktales, easily accessible to those interested in the stored data while simultaneously preserving it so that it will not be lost in the future.

Next, we were shown how a TEI file is structured. A TEI file always contains a header and a text, and the text in turn contains the body. There are also main containers and sub-containers (short: sc) which contain elements. Their purpose is to define the structure of the marked-up document. For example, when coding we write an opening tag (such as <p>), then we insert the content or sub-containers, and then write the matching closing tag (such as </p>). We have to keep in mind that tags are an integral part of a markup language. If a tag is missing or not closed, the TEI file is not well-formed, which results in an error. At the beginning of every TEI file, we need to type <TEI…> before we begin with the header and the text/body. Only when we are completely finished with the TEI file can we type </TEI>. This is the first step.
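To make the first step more concrete, here is a minimal sketch of the outer skeleton of a TEI file. It is my own illustration, not the file from class: the paragraph is a placeholder, and the header is left empty, so the sketch is well-formed XML but would still need its mandatory header content before it validates as TEI.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The root element is opened first and closed last. -->
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Step 2: the header with the metadata goes here (left empty in this sketch). -->
  <teiHeader>
  </teiHeader>
  <!-- Step 3: the text, which contains the body. -->
  <text>
    <body>
      <p>Placeholder paragraph.</p>
    </body>
  </text>
</TEI>
```

Every opening tag has a matching closing tag, and the tags are closed in the reverse order in which they were opened.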

The second step is to create a header. The header needs to be tagged as <teiHeader>. There, the main container is situated. It contains metadata, like the author, storyteller, publication information, editors, sponsors, etc. The mandatory information sits inside <fileDesc>…</fileDesc> (file description), which contains <titleStmt></titleStmt> (title statement), <publicationStmt></publicationStmt> (publication statement), and <sourceDesc></sourceDesc> (source description). When all the mandatory information is in the TEI file, we close the container with </fileDesc>. There are also optional elements, like <encodingDesc> (encoding description), which details editorial decisions or the relationship between a text and the source from which it was derived. When the header is completed, we use </teiHeader>.

We were shown an example of how the header is structured:
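The example from class is not reproduced here, so the following is only a sketch of what such a header could look like. All texts are placeholders, and using <editorialDecl> inside <encodingDesc> is my assumption about where the editorial decisions would go.

```xml
<teiHeader>
  <!-- Mandatory part: the file description with its three statements. -->
  <fileDesc>
    <titleStmt>
      <title>Placeholder title of the folktale</title>
      <author>Placeholder name</author>
    </titleStmt>
    <publicationStmt>
      <p>Placeholder publication information.</p>
    </publicationStmt>
    <sourceDesc>
      <p>Placeholder description of the source.</p>
    </sourceDesc>
  </fileDesc>
  <!-- Optional part: notes on how the text was encoded. -->
  <encodingDesc>
    <editorialDecl>
      <p>Placeholder note on editorial decisions.</p>
    </editorialDecl>
  </encodingDesc>
</teiHeader>
```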

After that, the text is tagged as <text>. The text contains the body (tagged as <body></body>). First, we need to put the title or subtitle: this is done with <head>…</head>. To mark up sections within a text, we need to use <div>…</div>. To create paragraphs, <p>…</p> is used. Finally, to encode lines (for example the lines of a song) or quotations, we use <l>…</l> and <q>…</q>. When we are finished with the body, </body> needs to be put at the end. To finish the text section, we type </text>.

This was another example shown to us of how to successfully create a text/body section:
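Since the example from class is not reproduced here, this is a small sketch of a text/body section that uses the tags mentioned above. The contents are placeholders and the arrangement is only one possible way to combine the elements.

```xml
<text>
  <body>
    <div>
      <!-- The title of the section. -->
      <head>Placeholder title</head>
      <!-- A paragraph of narration with a quotation inside it. -->
      <p>Placeholder narration. <q>Placeholder direct speech.</q></p>
      <!-- Two lines, for example of a song. -->
      <l>Placeholder first line</l>
      <l>Placeholder second line</l>
    </div>
  </body>
</text>
```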

To conclude, I think the introductory lesson about TEI and XML was easy to follow and was explained in a way that wasn’t too complicated. At first, I was a bit nervous about learning how to code because I don’t have a lot of experience with coding, but after this session I am interested in learning more about TEI and in how we are going to implement coding in our future sessions of the seminar.

Folk tales, Coding and Old Ladies and their Cats

Introduction

Digitalising folk tales from cultures and areas deeply rooted in oral traditions helps archive these tales, as well as the languages and cultures they derive from, and thereby gives them the opportunity to transcend time and space to reach more people. One might of course ask whether translating folk tales from their original marginalised language into an institutional, widely spoken language such as English means domesticating this language and therefore ultimately contributing to that language’s endangerment. However, as we also concluded in class this winter term, if the translator uses the method of foreignization, meaning that they give visibility to the original language and culture (cf. Venuti), translation can also help preserve an endangered language. Translating and showcasing the Konkomba language is rather a case of preserving and archiving the existence of the language and culture, because it is in danger of going extinct in the near future. Therefore, translating (with the method of foreignization) and subtitling the Konkomba folk tales in English gives visibility to the language and culture: hearing the folk tales in their original language while following along with the English subtitles helps a non-Konkomba listener become familiar with the language as well as the culture.

The folk tale

A folk tale is a tool of language and culture documentation, transmission, and preservation. In a traditional sense, folk tales are oral stories that are passed on from generation to generation, but it became more common for them to also be written down over the years (cf. Thompson 4). However, there are cultures and areas around the world where orality and oral folk tales dominate and hence these stories do not necessarily exist in a written form. 

A folk tale is region-specific and always expresses, communicates, and transmits beliefs, values, a moral, or myths, among other things, of its culture of origin. It is mostly quite short, and changes are inevitable with each re-telling because of its oral nature (cf. Pullum 96). The folk tale, its language and culture are sources of indigenous knowledge about a people’s history, cultural heritage and belief system and are hence deeply interwoven with each other, as explained by our course instructor Tasun Tidorchibe.

The Konkomba folk tale “A Cat Saves an Old Lady from a Troublesome Wolf” gives insight into the reason why most old ladies are fond of cats, and has its roots in Konkomba mythology. The folk tale is about an old lady – an upininkpil – and how a wolf always steals the food she is making. Whenever the old lady makes yam and pounds fufu (a type of mash), she sings a song. Hearing the song, the wolf approaches and answers upininkpil’s song with one of its own. This causes the old lady to run away in fear. Because the wolf always comes when she is making food, she has tried to seek protection from other animals, but without success. However, one day a cat visits the old lady and offers to capture the wolf for her. The cat tells her to do everything as usual and the old lady agrees. The wolf comes, starts singing its song and enters the old lady’s house. But at that moment the cat jumps on the wolf and kills it, saving the old lady from her predicament. The cat chose to save the old lady instead of a fellow animal. This is believed to be the reason why many upininkpils keep cats as their pets.

Process: Coding and Video Editing

After receiving the folk tale, I began coding in Visual Studio Code. I started with the header and put in all the required information: the title of the folk tale, the author, storyteller, editor, date, and place, among others. It was a bit difficult, however, to find the exact geographical location of Chakping, the village in which the recording had taken place. After I had completed the header, I started encoding the folk tale. As we had practised this a lot in class, I encountered few problems while encoding the story. The only thing I had to remember was to use italics. Encoding the song included in the folk tale was a bit more challenging, as there were multiple song sections, so I had to use a lot of tables and division elements to go back and forth between the narration and the song sections of the folk tale. Lastly, I encoded the notes and the glossary. As we had talked about this a lot during the term as well and had even changed our approach to it, it went quite smoothly. I used <gloss xml:id> in the glossary and <term ref> in the story, because an xml:id always needs to be unique and hence cannot be repeated in the code if a term comes up multiple times in the folk tale. At the end, after I had finished the coding, I checked what it looked like by converting the TEI document into a PDF via TEIGarage.
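The glossary approach can be sketched roughly as follows. This is a simplified illustration rather than my actual file: the two sentences are made up for the example, and the type values on the divisions are my own assumption.

```xml
<text>
  <body>
    <div type="story">
      <!-- Both occurrences of the term point to the same glossary entry via ref. -->
      <p>An <term ref="#upininkpil">upininkpil</term> was pounding fufu.</p>
      <p>A cat visited the <term ref="#upininkpil">upininkpil</term>.</p>
    </div>
    <div type="glossary">
      <!-- The xml:id is declared exactly once in the whole document. -->
      <p><gloss xml:id="upininkpil">an old lady</gloss></p>
    </div>
  </body>
</text>
```

The reference starts with # followed by the ID, so the term can appear as often as needed while the ID itself stays unique.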

Then, I started with the subtitling of the video. I used the software Subtitle Edit, which one of my classmates had recommended, as it is a lot easier to use than DaVinci Resolve’s subtitle function. Because I used this software, I was able to work rather quickly on the subtitles, as I only had to set the correct number of characters per line, copy and paste the subtitle text into the software and adjust the timestamps for each subtitle sequence. Thanks to our instructor Tasun Tidorchibe, who provided me with the timestamps, I had no real issue with this process. The only thing that slowed me down a bit was the fact that I had to use my family’s old laptop because the software did not work on my own computer. I exported the subtitles and then went on to work in DaVinci Resolve for the video editing.

DaVinci Resolve, similar to my encounter with it during the term, was a bit of a struggle once again. Not only did the software almost shut down when I was nearly done, but I also encountered a problem with the display of the video in the software itself: the image was suddenly gone, and even though I somehow managed to get it back, the size was off. Thankfully, nothing major happened and I was able to edit and finish without any damage to the video itself. During the video-editing process, I had to adjust some of the subtitle timestamps for them to align with the storyteller’s speech, as well as lengthen them a bit because some were just too short at first. I also added some subtitles for background noises and adjusted the overall look of the subtitles for them to be easier on the eyes. I had to render the video twice because I wanted to change some things after having a look at my first draft. Because I had issues with editing the title, credits and copyright in DaVinci Resolve for the folk tale I worked on in class, I decided to use my computer’s own video-editing software for the finishing touches, since I am more familiar with its handling.

So lastly, I used iMovie to add a title page, the credits and the CTS logo for the copyright. I was not able to put the logo into the video for the folk tale I worked on in class, because DaVinci Resolve did not let me. With iMovie it thankfully worked, so I put the logo into the upper right corner of the video. With that I was able to finish the video-editing part without any further struggles and also completed my task of digitalising the folk tale.

Conclusion

Digitalising this Konkomba folk tale preserves not only a version of the folk tale itself but also the culture and language it derives from. As the Konkomba people are a minority culture and their folk tales are hence minority oratures, translating and digitalising this folk tale means that its story and its orality have been transported and archived (cf. Bandia 111). Contributing to this project really opened my eyes to the beauty of folk tales once again. When I was little, I loved fairy tales, folk tales and mythology from different cultures, but I did not keep up my interest in them all that much the older I got. Additionally, I got to know a culture I had no previous knowledge about, which also shows the power folk tales hold in terms of transmitting and communicating more than just a story. I got to improve my coding and video-editing skills as well, which I had not utilised in quite a while. So all in all, this was a very enriching experience and now I know why old ladies like to keep cats as their pets – and possibly not only in the Konkomba culture.


Secondary literature

  • Bandia, Paul. “Orality and Translation.” Handbook of Translation Studies Volume 2, edited by Yves Gambier and Luc van Doorslaer, John Benjamins Publishing Company, 2011, pp. 108-112.
  • Pullum, Tracie. “Promoting Writing with Folktales.” The English Journal, vol. 87, no. 2, 1998, pp. 96-97.
  • Thompson, Stith. The Folktale. 1946. Berkeley and Los Angeles, University of California Press, 1977.
  • Venuti, Lawrence. The Translator’s Invisibility. London, Routledge, 1995.

Last Session: Presentations

For our last session on Thursday, 26th January, we were tasked with presenting the projects that we had been working on throughout the semester. We discussed our experience of encoding the Konkomba folktales and editing the videos, focusing on the various difficulties we faced and how to avoid mistakes. In the last blog post you can read about our experience and problems while encoding the folktales. This is why I am going to talk about our video editing experience here.

Video editing problems 

This was our first experience with video editing and with the video editing programme DaVinci Resolve. Therefore, many problems and difficulties occurred while editing the videos and adding the subtitles. In the following, I’ll be sharing the problems we faced and our tips to avoid these mistakes.

If you are working in a group, it is easier to use the same video editing programme. This avoids issues when merging different parts of the video, which we had difficulties with. Another issue was the length of the subtitles, as ours were often way too long. As a rule, a subtitle should not appear for more than eight seconds and each line should not exceed 40 characters. Additionally, you should always add a title and copyright, and it is easier to add the title before adding the subtitles. Using the pyramid form for subtitles with two lines gives a clearer structure. Furthermore, it is important to subtitle everything and check the timing of the subtitles to ensure they match the content. In the end, when exporting the video, don’t forget to export the subtitles and burn them into the video.

A couple concluding words

Video editing can be a challenging task, especially for those who are completely new to it and have never used a video editing programme such as DaVinci Resolve. However, with some practice and the previously mentioned tips, it will be easier to edit a video the next time. The course “Demarginalising Orature – Translating Minor Forms Into the Digital Age” provided us with the opportunity to learn new skills in video editing and coding. However, what was very special about this course was that we learned about the Konkomba culture and its folktales.

Group Work: Final Edits

During the session on the 19th of January, we worked on the finer details of our TEI files in Visual Studio Code. Because each group received a piece of paper that listed their errors, we spent the session correcting them.

Examples of Errors:

The gravest mistake my group made while encoding the stories was that we did not format the glossary and notes correctly.

code for the glossary in the story section
code for the glossary in the glossary section

The example from the pictures is our solution. We had previously mixed these two up and used the target attribute in the glossary and the xml:id in the text itself. Sadly, this did not work in the same manner as the correct solution. Funnily enough, we managed to get it right in one of the stories we encoded but not in the other one. I am still unsure how exactly we managed to do that.
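In case the pictures are not visible, the mix-up can be sketched like this. It is a reconstruction under assumptions, not a copy of our file: the ID term-id and all texts are placeholders, and I am assuming <term> in the story and <gloss> in the glossary.

```xml
<body>
  <!-- Mixed-up version (did not work as intended):
       <p>Placeholder sentence with a <term xml:id="term-id">placeholder term</term>.</p>
       <p><gloss target="#term-id">Placeholder explanation.</gloss></p>
  -->
  <!-- Corrected version: the text points to the glossary with target. -->
  <div type="story">
    <p>Placeholder sentence with a <term target="#term-id">placeholder term</term>.</p>
  </div>
  <!-- The glossary entry carries the unique xml:id. -->
  <div type="glossary">
    <p><gloss xml:id="term-id">Placeholder explanation.</gloss></p>
  </div>
</body>
```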

We also experienced some difficulties while adding a link to the code. We were supposed to provide a link to the subtitled video of the story, but we were unsure where exactly in the introduction it was supposed to appear. As could have been expected, we chose the wrong place: the link has to be the last element of the introduction of each story. The picture below shows how a link can be added to the code.

link to the video in the code
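As a rough sketch of the same idea, a link could be encoded like this. The URL, the wording and the division type are placeholders, and <ref> with a target attribute is the standard TEI way to encode a link, which I am assuming here rather than quoting from our file.

```xml
<div type="introduction">
  <p>Placeholder introduction to the story.</p>
  <!-- The link is the last element of the introduction; note the full stop after it. -->
  <p>To watch the subtitled video, <ref target="https://example.org/placeholder-video">click here</ref>.</p>
</div>
```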

While we were coding the link into the XML file, we also managed to make another mistake. We were so focused on trying to add this new element that we completely overlooked our punctuation. In this case, we forgot the full stop at the end of the introductory paragraph after “click here”. Additionally, we somehow forgot to change one quotation mark in the story itself. As quotation marks are not allowed in this file, you need to replace every opening quotation mark with the tag <q> and every closing one with </q>. This can be seen in this example.

direct quote from the story
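A minimal sketch of such a replacement, with placeholder wording instead of the actual sentence from our story:

```xml
<!-- Before: Then she said, "Placeholder direct speech." -->
<!-- After: the quotation marks are replaced by the q element. -->
<p>Then she said, <q>Placeholder direct speech.</q></p>
```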

Lastly, we did not know that we each needed a name tag in the copyright information. We put both of our names into one name tag but changed it during this session.

Summary:

All in all, these were mistakes we made because we were not careful enough while we were coding. Many of them could have been avoided if we had read through our code a little more thoroughly; however, now we know what our mistakes were and we will be more careful in the future.

This session was also the last one we had before we held our presentations, so many groups already started to talk about how to organise this.

Introduction to video editing and subtitling

Subtitle Example: Bilinyi Chikpaab James narrates “Nachiin Pays for Feasting on Unyii’s Children” / Source: HHU Mediathek

Hello everyone!
In this post I’m going to tell you a little bit about our last “Demarginalising Orature” session. As you may have guessed from the title, we talked about and worked on video editing and especially subtitling. In the past few weeks we have learned about Konkomba folktales, language and culture, and we have worked with some of the folktales by encoding them using TEI. And now the next step is editing videos of Konkomba people narrating the folktales. Ultimately, you will find them in the HHU Mediathek.

Our last session

So, what happened in our seminar? Firstly, our tutor Jana gave a presentation, introducing us to a video editing program called DaVinci Resolve (DVR). She also introduced us to some of the basics of subtitling, e.g. the reading speed of a subtitle, which should be no more than 30 characters per second (CPS). The ideal speed is 15-20 CPS but, as Jana pointed out, this is quite difficult to achieve. Furthermore, a subtitle should always start synchronously with the speech (defining a subtitle’s start and stop point is called spotting). If the subtitle comprises 2 lines, it should be presented in pyramid form, so the upper line should ideally be shorter than the lower one. There are many more rules and conventions regarding subtitling but naming them all would go beyond the scope of this blog entry.

DaVinci Resolve and SubtitleEdit

“Edit” page in DaVinci Resolve / Source: https://www.blackmagicdesign.com/products/davinciresolve

Then, a fellow student, Lisa, gave a short presentation on DVR and introduced us to another program. According to both Jana and Lisa, DVR can be a bit difficult to work with, especially in the beginning. But luckily Lisa is familiar with another subtitling software called SubtitleEdit. You can find a very useful step-by-step tutorial for DVR and SubtitleEdit in her blog entry. Some “fun facts”: According to their website, DaVinci Resolve is Hollywood’s #1 post solution. Apparently, many films and TV shows are edited in DVR. It was first released in 2004. SubtitleEdit, on the other hand, is a free open-source subtitle editor.

SubtitleEdit interface / Screenshot by Lisa

Conclusion

To sum it up, in our last session we learned about subtitles and subtitling tools. During our session, DaVinci Resolve made my laptop crash and there were definitely some initial difficulties. But SubtitleEdit is a bit more beginner-friendly and in the end we will manage to subtitle all our folktale videos, I am sure! Yet another step towards conserving orality and making Konkomba folktales accessible to a broader audience!

Group Work: Encoding folktales

In today’s session we presented our group work: every group of two to three people had encoded a folktale in TEI. We shared our experience with encoding itself, issues that occurred while working on the stories, and problems we had with the program Visual Studio Code.

Issues while encoding

The groups used different approaches to highlight the Likpakpaln terms: some just tagged them with <term>, others additionally highlighted them as superscript. There were also struggles with placing the end-tags at the right spot, but Jana revised our TEI-documents and made us aware of and helped with our issues and mistakes.

Livia and I had problems with Live Share within Visual Studio Code, but worked our way around it. Our group also highlighted the headers with <hi rend="bold">, but Jana reminded us that a header wrapped in the <head>-tag alone will already be displayed in bold, so after revising we took out the <hi>-tags to avoid overcrowding the document unnecessarily. Jana also pointed out that there was a little inconsistency in our xml:ids, since we had a little mix-up when tagging the terms with the appropriate ID.

One way to appoint an ID in the glossary …
… and referencing it in the text.

There was some confusion in class about when to use the xml:id and the target attributes, since an ID must be unique within a document. Livia and I tried the solution of using the xml:id attributes within the <gloss>-tags in the glossary and referencing the IDs by using <term target="#term-id"> around the terms within the text. As it turns out, this works, so we were quite happy to have found a solution.

New folktale, new issues

Moving on, or rather continuing to practise, the groups chose new folktales to work on. We were instructed to take on a story that contains a song, so we could practise the use of tables in a TEI-document for presenting the original Likpakpaln song text next to its translation. The <table>-element is a tricky one, because you need to build a table with its rows and columns, which can be very hard to envision when there is no spreadsheet in front of you, but instead something like this:

The first cell of a row always contains the line in Likpakpaln, the second the English translation.
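For readers who cannot see the picture, a table of this kind can be sketched as follows. The cell contents are placeholders, not lines from an actual song, and wrapping the table in a song division is my assumption.

```xml
<div type="song">
  <table>
    <!-- Each row holds one line of the song. -->
    <row>
      <cell>Placeholder line 1 in Likpakpaln</cell>
      <cell>Placeholder English translation of line 1</cell>
    </row>
    <row>
      <cell>Placeholder line 2 in Likpakpaln</cell>
      <cell>Placeholder English translation of line 2</cell>
    </row>
  </table>
</div>
```

Read row by row, the left cells give the original song and the right cells its translation.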

Unfortunately (or luckily?), the folktale Livia and I chose contained a very simple song that only consisted of names and so it didn’t need a translation, ergo no table. Instead, we used the <l>-tags – l standing for ‘line’ – for each row.

The instructions in parentheses might create a new problem:
Should they be part of the song division or outside of it?
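One possible arrangement is sketched below, with placeholder contents. Here the instruction sits inside the song division as a paragraph. This is only one option and not necessarily the one we should settle on; the alternative would be to move the <p> in front of the opening <div>.

```xml
<div type="song">
  <!-- Option shown here: the instruction is part of the song division. -->
  <p>(Placeholder instruction in parentheses)</p>
  <!-- One l element per line of the song. -->
  <l>Placeholder name</l>
  <l>Placeholder name</l>
</div>
```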

By encoding various folktales, I think all of us realized that TEI and XML are a bit complicated, but actually very logical in their use. Although it seemed abstruse and confusing when learning about the tags and attributes in the beginning, everything makes sense when practically working with it. Encoding is definitely a practice that needs a lot of exercise and revision to understand it. And our work within the sessions really helps here by applying the universally known phrase: learning by doing!

The Homestretch of our TEI Introduction

Last week we finished our introduction to TEI and started our group work of this semester.


TEI Introduction III

For the TEI part of the class we dealt with common mishaps that occurred in our TEI documents of the folktale “Why the Python’s Skin has Dark-Brown Blotches”, which we had worked on the week before. None were major mishaps, but they still concern parts of the code that are important for the document to come together. These mishaps included: forgetting <head type="subTitle"> to indicate subtitles in the document, closing divisions too soon, and – which wasn’t really a mishap at all – the fact that we don’t need to use the <q>-tag anymore if we use a division for ‘song’.
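Put together, the three points could look roughly like this. It is a sketch under assumptions: the subtitle and all contents are placeholders, and the type values narration and song are my own labels for the divisions.

```xml
<body>
  <head>Why the Python's Skin has Dark-Brown Blotches</head>
  <!-- Subtitles get their own head with a type attribute. -->
  <head type="subTitle">Placeholder subtitle</head>
  <div type="narration">
    <p>Placeholder narration before the song.</p>
  </div>
  <!-- The song division is closed only after its last line; no q element is needed inside it. -->
  <div type="song">
    <l>Placeholder song line</l>
    <l>Placeholder song line</l>
  </div>
  <div type="narration">
    <p>Placeholder narration after the song.</p>
  </div>
</body>
```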

Then we talked about how best to encode notes and glossaries by using a <list>-tag.

An example for <list>.
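If the picture is missing, the following sketch shows the general shape of such lists. All contents are placeholders and the division types are assumptions. The glossary uses the TEI convention of a list with type="gloss", in which each <label> is followed by the <item> that explains it.

```xml
<div type="appendix">
  <div type="notes">
    <head>Notes</head>
    <list>
      <item n="1">Placeholder text of the first note.</item>
      <item n="2">Placeholder text of the second note.</item>
    </list>
  </div>
  <div type="glossary">
    <head>Glossary</head>
    <!-- In a gloss list, labels and items come in pairs. -->
    <list type="gloss">
      <label>Placeholder term</label>
      <item>Placeholder explanation of the term.</item>
    </list>
  </div>
</div>
```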

Another thing we talked about before we started with our group work was the issue that xml:ids need to be unique, meaning that each one can only be used once in the whole document, which proves difficult if we want to ID the same term throughout a folktale. The work-around we decided on is that we will only ID the first instance of a term in a folktale, and only that one time. This also works well with our aim to foreignize the folktale for its readers, as only having an explanation for the first time an unknown term comes up means that the reader will have to engage closely with a folktale to understand it completely.


Group Work

And lastly for last week’s class we got together in our groups, decided on a folktale to work on, and started with that. Working on our own folktales was really doable thanks to the introduction to TEI the previous three weeks, and therefore I want to thank Jana and Tasun again for providing us with so much in-class information and answering our questions!

[Addendum] Folktales, language and culture

Introduction

Hello everyone! This blog entry was meant to be published a while ago – sorry for the delay! Luckily, Anne also published a blog entry on that lesson of the “Demarginalising Orature – Translating minor forms into the digital age” seminar. I hope you have all read it; it was very informative and I can only add a few things.

Why the Wasp has a Tiny Waist

Something Tasun said in the seminar that stuck with me was how folktales in general have multiple important functions. In fact, they do not only have entertainment qualities but also contain moral lessons and are used as pedagogical tools. They can teach the values and customs of a specific culture. One example for this would be the folktale “Why the Wasp has a Tiny Waist”, which we have also talked about in the seminar.

Conclusion

To sum it up, the lesson successfully conveyed the cultural importance of Konkomba folktales and the Likpakpaln language. Therefore, it is important to preserve them (e.g. by making them accessible to a broader audience). Hopefully, this blog entry can play a tiny part in this.

Coding as a Humanist – Encoding a Konkomba Folktale in XML

As part of the course “Demarginalising orature – Translating minor forms into the digital age”, our goal was to transcribe and digitalize some of the folktales that are shared among the Konkomba people as an oral tradition, to make them available to a broader audience. We had to keep in mind that their culture is based on oral traditions when we digitalized their history. The dominant written language was a result of colonization, but the tradition of orality prevails. The telling of the tales is part of the Konkomba’s daily lives. They gather in the middle of their village either during communal work, to entertain each other, or in the evening as a way of spending time together and bringing the tales closer to the younger generation, so that they are not forgotten. The way these tales are told is very important. The storytellers interact with the audience through questions, gestures or emotions.
Because of changes regarding the community’s lives and work structure, these moments tend to be rather scarce now, as we were told in our introduction to the topic. Since very few tales had been written down, Tasun Tidorchibe visited the villages and recorded, transcribed and also translated the folktales into English. Our assignment was to turn the transcriptions of the folktales into a searchable PDF via TEI and produce a video with subtitles from the audio and video files given to us.

The folktale Lilli Bloch and I were working on is called “Nachiin Pays for Feasting on Unyii’s Children” and was narrated by Bilinyi Chikpaab in Kutol on 18th March 2022. The story is about a wolf, who deceives a crocodile in a vicious manner in order to eat all of her children. Since the crocodile is the guardian of the river, no one is allowed to drink from it, as a result of the actions of the wolf. But the rabbit, seen as the wisest character and trickster in Konkomba folktales, needed to drink and made a deal with the crocodile: the crocodile would get revenge on the wolf for eating all of her children and in return the rabbit would not get eaten. In order to bring the wolf to justice, the rabbit and crocodile outsmarted the wolf and built a trap which resulted in the wolf losing his testicles to the crocodile and the rabbit’s life being spared.

Since the provided audio is in Likpakpaln, the language spoken by the Konkomba people, Lilli Bloch equipped the video with English subtitles, while I worked on the PDF, using TEI and XML. I started with the TEI header. I used the header we created in class as an orientation and adapted it to fit this particular folktale. Secondly, I copied and pasted the story into the document, to have the base covered and to work from there. I put the story in paragraphs, so it is easier to read. Then I marked the words that we wanted to explain in the glossary. Since there are some words or phrases that don’t have a suitable translation or could only be exchanged for a lengthy description or explanation, they are kept in Likpakpaln in the text, but are explained in a glossary at the bottom of the document. This is an easy way to display an explanation for terms that cannot be translated from foreign languages. For readability, I also put in three footnotes at the beginning with only the translations of the animal names, so the reader can picture the animals and have a better understanding of the story. We decided to put the longer explanations in a glossary at the end, so that the pages in the document would not be crowded, but we didn’t want to exclude them, since they are a way of teaching background knowledge and offering context for the reader about the Konkomba people and culture. We also put the information about the footnotes into the editorial declaration in the TEI document.

At first, I was a little bit sceptical about whether it is possible to put two tags around a word, but luckily it worked out fine and it looks just like I wanted it to in the PDF. The glossary and footnotes were the tricky part of the coding, since you have to think of a lot of elements that have to be used in order for them to work properly in the converted file. We chose to put the explained term in a bold font and the explanation under it, so it has a tidier look to it.

In the TEI document
In the finished PDF

After Lilli was done with the video, I inserted a link, given to us by our tutor Jana Mankau, into the document. When clicked in the PDF, it leads to the repository where the video with the subtitles is uploaded. In addition to providing a digitally readable transcription, this means the tale can also be listened to and the interactions of the storyteller Bilinyi Chikpaab with his audience can be observed. Once this was done, I uploaded the document to a platform called teigarage.tei-c.org, where I converted the document into a PDF.

#demarginalizingorature #TEI #coding #konkomba #folktales #culture #orality #digitalize #oraltradition