How I translated The Reckless Cricket is quite a lengthy answer. Before going into detail, I will put it simply. I used the Whisper (speech-to-text) function embedded within Subtitle Edit to provide a basis of Chinese text, which I manually checked, corrected, and then translated with the help of a variety of Cantonese dictionaries (listed below). Even simpler: Whisper threw the alley-oop, I dunked it. Typing it out in a single sentence makes it sound so much easier than it actually was, and whether or not I actually dunked it remains to be seen!
Also, yes, Whisper is an AI model, but as it’s just a program creating text from speech, I’m not sure it really qualifies as “AI.” Sure, it learns as it goes, but after working with it for 16 months, I don’t know how intelligent it really is. My point is that while Whisper opened up the possibility of doing this translation, my Natural Intelligence (NI) was a required part of the process. I did not just run Whisper and then plug the results into a translation program; I manually checked and translated everything myself. So before anyone dismisses this as an “AI translation,” I think, if anything, I actively proved that AI alone cannot handle something as delicate as this. This translation was achieved through the combination of AI and NI, with the mindset of a librarian using trusted resources to accurately deliver required information. I believe the power of the human mind is limitless if we set our sights on a goal and we use the correct tools.
For anyone checking out now: I’m sure I missed things. I’m sure there are still errors here and there in the translation. I believe it’s mostly correct, and I did my best to verify what I thought I heard against what Whisper heard. I won’t really know how successful I was until/if Cantonese speakers watch the movie. The main subtitle file includes stacked Spoken Cantonese and English subtitles, so it should be relatively easy to see where or how I went wrong. If you speak Cantonese and you see an error or something I missed, kindly let me know if you’d like. All the time and energy I put into this translation was out of pure love and admiration for Hong Kong, its people and their cinema, and I want to make sure that is evident right up front. I have loved Hong Kong movies for over 25 years now, and this endeavor is my strange way of trying to give back and preserve this forgotten film in its original Cantonese language.
Even though this was unlike anything I have ever attempted, like the 爛頭蟀 “Reckless Cricket,” I refused to give up, no matter what!
Now for the extremely long-winded version!
I. Why?
As I mentioned above in the quick version of “How?,” the general foundation of “Why?” comes from my love for Hong Kong, its people, and their cinema.
Initially, I just wanted to watch The Reckless Cricket with English subtitles, but it quickly blossomed into much more than that. The film is largely forgotten, but I believe all films should be available for anyone to see and form their own opinions, regardless of age or perceived worth. The opportunity to provide a translation for this film was exciting. As soon as I realized that I had the original Cantonese track, too, I knew I had to “preserve it” and attempt a direct Cantonese translation, instead of using the burned-in subs for the Mandarin dub. Ultimately, the two seem relatively close to one another, but there’s a flavor in the Cantonese that adds so much to the film (both in terms of language and the performances), in my opinion.
Because it’s a Hong Kong comedy, there’s not much interest in this film in the West. When I reviewed Crazy Nuts of Kung Fu, I acquired the English subs from a fan, and I inquired about Reckless Cricket. He said there’s no reason to translate it because no one cares about comedies. And I understand that sentiment from hard-line kung fu fans; when I first got into Hong Kong films in the ’90s, I never would have watched this film either. But in my adulthood, as I branched out my Hong Kong viewings into genres beyond action, my appreciation for the cinema of Hong Kong grew exponentially. It’s hard to explain, but broadening my Hong Kong horizons made everything seem richer and more vibrant. The genre that surprised and delighted me the most was the Hong Kong comedy, whether it was a Hui Brothers film or a Lunar New Year movie. I saw a side of Hong Kong I had never known, and it was a joy.
In this way, I felt that I was uniquely interested in The Reckless Cricket, and that if I didn’t translate it, no one would. I wanted to bring the film forward so it could be seen and judged on its own, but also as a contextual piece surrounding the well-known traditional martial arts films of the Shaw Studio. Perhaps the sensible thing to do would have been to hire a translator to create the subtitles, but when I look back at everything I went through and learned about Cantonese along the way, I wouldn’t trade the experience for anything. It definitely would’ve been quicker (and probably better! 🙂 ), but I would’ve robbed myself of an experience I’ll never forget. I gained a true appreciation for the work that goes into subtitling and translating a film. I learned that listening to Chinese is a skill, and I have an appreciation for Cantonese that I never could’ve gained otherwise. Not to mention all the fun Cantonese sayings I learned!
I put hours upon hours into this translation over the course of the last 16 months, with the end goal of releasing it to the public. But like everything I do, it’s more for me than anyone else. I truly enjoyed creating this translation and learning some Cantonese along the way, and that’s what matters to me the most. I expect people to watch the movie and wonder why I would spend so long on it. And that’s OK, but my answer is that I enjoyed doing it. I saw an opportunity to make a previously unavailable Shaw Brothers movie, directed by Kuei Chih-Hung, available with English subtitles for the first time since its theatrical release in 1979. Whether you like the movie or not, I’m happy it’s out there now for you to watch.
II. Finding the Film and Translation Experiments
It’s kind of meaningless to anyone other than me, but I want to explain the sequence of events that led me to this film and translating it. I never expected to be creating subtitles for any movie, let alone one that wasn’t in my native language. A few years ago, while researching upcoming films in my chronological Shaw Brothers project, I searched for the lost film The Reckless Cricket. I had done this a few times before over the years, but it had been a while and things have a way of finding their way onto the Internet, no matter how improbable. So I looked, and I found it! It was in an incredibly meager resolution, and the video was plastered with all kinds of Chinese scrolling ticker ads and watermarks. But whatever, I had a copy. A year or so went by, and I repeated the process. This time I found a version that was slightly bigger and it didn’t have any scrolling ads! Yay! At this point, 1979 in my review series was still a ways off, as my caregiving duties were increasing, my mental health was decreasing, and my ability to keep the reviews flowing dried up.
Let’s say another year or two went by (I don’t remember), and I found myself once again searching for a better copy. My search was rewarded, as I turned up a nice-looking copy in a fairly normal resolution. All three copies had Chinese subs burned into the video. (As a side note, about halfway through doing the translation, I found a “4K” version that was the best looking yet, so that’s what’s coming with the subs.) Now that I had a good copy, I started looking into OCR (Optical Character Recognition) for Chinese text. I was familiar with the process from my years of library work. I have a couple of subtitle programs that can perform OCR, but none were capable of handling Chinese text. After a few months of trying various solutions, I found a Google Chrome plug-in called Copyfish that did what I was looking for. It wasn’t always exact, but it was close enough for me to start work on OCRing the on-screen Chinese subtitles into workable text that I could translate. I started doing this in Sept 2023, and worked on it here and there until November 2023, OCRing about 20 minutes of the movie.
Around this point, I looked up the film in the appendix of the HKFA’s The Shaw Screen book and learned that the original language was Cantonese. The best-quality version I had been working with was definitely in Mandarin, but for some reason I had kept the other two versions I found over the years. Lo and behold, one of the first two was in Cantonese (I don’t remember which, and I have since deleted them)! In a stroke of luck I had downloaded a Cantonese version completely unbeknownst to me, and for some reason kept the file even after finding better-quality versions. (I should note here that while this Cantonese version was unavailable for years, I recently searched again and it has now shown up on YouTube, but it has watermarks, etc. similar to the first version I found!) Anyway, I ripped the audio from my file and added it to the nice copy of the film. Now with this new development, I realized that the burnt-in subtitles did not exactly match the Cantonese track, and in that moment this whole endeavor was born. The only thing stopping me was that Whisper, the speech-to-text AI program with its functionality embedded into Subtitle Edit, was unable to handle Cantonese.
III. The Initial Whisper
Around this same time (mid-2023), I became aware that Simon of the website The 14 Amazons was creating GPT-Subtrans, a program specifically designed to translate subtitles using AI language models. Initially, I hoped to use this program to quickly translate the OCRed Chinese text into English. Simon’s program was made with films in mind, so it is able to deliver better quality overall than just sticking one line at a time into Google Translate or similar. While I was attempting the OCR, Simon continued development of his program, adding a GUI and making it easier to use. After I mentioned what I was attempting, he suggested trying Whisper from within Subtitle Edit. I replied that I had tried it, but that Whisper didn’t support Cantonese. Simon informed me that the latest model had just been released, and that it had added Cantonese support. So I updated Subtitle Edit to the latest version and, sure enough, Cantonese was there.
The initial whispering of the film took about 90 minutes, if I remember correctly. When it was done, it had created subtitles with Chinese text that spanned the entire movie. At this moment I thought I was halfway done! Hahaha, if only! I took that subtitle file and fed it into Simon’s program. After fixing some issues preventing the file from processing properly, I had an English subtitle file. I started up the movie to see how great it was. Unfortunately, it became clear pretty quickly that there was a missing link. It was in English, but it was off. Either Whisper was not hearing correctly, or the AI translation models in use in Subtrans were incorrectly processing the Chinese. Turns out it was kind of both.
At the time I first whispered the audio into text, Whisper was very inconsistent in delivering spoken Cantonese text. It delivered written Cantonese most of the time, which is closer to Mandarin. In theory, for simply creating English subtitles this shouldn’t be a problem, as long as Whisper was delivering text that translated to what was being said. But to accept written Cantonese from Whisper would also mean that it eliminated much of the spoken Cantonese slang I wanted to preserve in my translation, defeating the purpose. As I delved deeper into the generated text, I found that Whisper was also mishearing lots of characters, or inserting the wrong character for a sound. I do not speak Cantonese, so you might think that a program designed for this process would do better than my untrained ear, but it was clear something was wrong. The intro to The Reckless Cricket helped to clue me in, as it features on-screen text and a narrator.
Along with this, the translation of the initial whisper by Subtrans offered some lines that made absolutely no sense. The one that springs to mind is from the film’s intro. The grandfather of Reckless Cricket was a doctor and at the end of his story, he says 磅水 to his patient. The translation of this was “Release the water!” Huh? That makes no sense. I tried to figure out what he was saying, because it sounded to me like Whisper had the characters correct. On CantoDict, I found that 磅水 literally means weigh water, but it’s slang for Pay up! Water, in general, is slang for money, too. In discovering this, I realized that not only was Whisper mishearing some things, Subtrans was going to be unable to process Cantonese slang correctly. I realized that the only way forward with the translation was to manually check these issues, correcting them as I went along.
IV. The Main Translation
At this point, I started the lengthy process of going line-by-line through what Whisper created, figuring out what it got right, correcting what it got wrong, and manually translating the results with the help of dictionaries. This entire process took me roughly 13 months (Mar 2024 — Apr 2025). The first half of the movie took the longest, with the first 45 min of 89 done at the end of Dec 2024 (nine months). This section of the film is a little more dialogue-heavy (800 of roughly 1500 lines), but mostly it took so long because during this phase I was training my “Natural Intelligence model” in Cantonese listening and understanding. I was also slowly developing a method of translation that I used going forward, and that was a process unto itself.
Once I had the method and a foundation of Cantonese ability in place, it all started to go much quicker. This coincided with Whisper’s ability to more often deliver spoken Cantonese sometime in early 2025. Ah, we were learning together, how cute! I completed the second half of the movie (roughly 700 lines) in only four months, with an additional six weeks to go back through the whole thing and make sure everything was as good as I could get it. That gets me up to where I am now, typing up this “How & Why” thing that no one will read!
V. The Method & The Tools
My basic, evolved method to translate each line follows below. It wasn’t so much a linear process, as it was an organic flow between multiple steps as problems presented themselves. But I’ve done my best to put it into a linear form here.
1. With the initial Whisper text and Subtrans translation in place, I asked Subtitle Edit to re-whisper the line at hand.
There are a number of different “engines” of Whisper (no, I don’t know exactly what that means), and the ones that delivered the best results for me were Purfview’s Faster Whisper and CPP cuBLAS. So I would ask for a re-Whisper with Purfview’s Faster Whisper, and then with CPP cuBLAS. Sometimes I would have to do this a number of times to get it to return something workable.
In doing this, I found that Whisper would “hallucinate” at times, returning something wildly and obviously wrong. This is a “known issue” with AI models. These hallucinations became less frequent as the work went on, with Whisper eventually returning nothing or a “stock phrase” of 我哋返嚟喇 that I came to instantly recognize as “I have no idea.” The Chinese doesn’t actually mean that (it means “We are back”), but it did in practice. I chose to read it as “We are back, and we have brought nothing.” 😛 The hallucination rate also improved when I learned how to finesse better results out of Whisper, generally by asking it to deal with a larger block of time. Short or obscured lines continue to be a challenge for Whisper, but if you ask for a couple of lines together, it’ll often have a better result. In general, Whisper works better with some context. Hey, just like me!
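For anyone who would rather script the “ask for more context” trick than click through it, here is a minimal Python sketch. It is only an illustration, not how Subtitle Edit or Whisper work internally: the Cue class, the function names, and the sample cues are all made up for the example. It flags cues that came back empty or as the stock phrase, then widens the time span to include the neighboring lines, so that a larger block could be sent for re-transcription.

```python
from dataclasses import dataclass

# Stock phrase described above; treated here as a "no usable result" marker.
STOCK_PHRASE = "我哋返嚟喇"


@dataclass
class Cue:
    """One subtitle cue (hypothetical structure for this sketch)."""
    start: float  # seconds
    end: float    # seconds
    text: str


def needs_retry(cue: Cue) -> bool:
    """True if the cue is empty or contains only the stock phrase."""
    stripped = cue.text.strip()
    return stripped == "" or stripped == STOCK_PHRASE


def context_window(cues: list[Cue], index: int, neighbors: int = 1) -> tuple[float, float]:
    """Return a (start, end) span covering the cue plus its neighbors."""
    first = max(0, index - neighbors)
    last = min(len(cues) - 1, index + neighbors)
    return cues[first].start, cues[last].end


# Made-up sample cues, only to show the shape of the data.
cues = [
    Cue(10.0, 12.5, "你睇下"),
    Cue(12.8, 13.4, "我哋返嚟喇"),
    Cue(13.6, 16.0, "磅水"),
]

# Spans worth re-transcribing as a larger block.
retry_spans = [context_window(cues, i) for i, c in enumerate(cues) if needs_retry(c)]
print(retry_spans)
```

The neighbors parameter is the knob to play with: a short or obscured line gets re-heard together with the lines on either side of it, which mirrors the “couple of lines together” approach described above.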
2. Now with three variations of Whispered Chinese text, I copied them into CantoDict’s Parser. CantoDict has a neat feature that allows you to hover over characters to see meanings, as well as having robust slang entries that show if certain characters are grouped together. It also has entries marked as Cantonese only, or Mandarin only, etc., often with links to the corresponding text for the Cantonese version of Mandarin sayings, etc.
In virtually all cases, it wasn’t as simple as just looking at the results of CantoDict’s Parser and calling it good, though. That was only the beginning. As I mentioned before, Whisper would often insert the wrong character for the sound, such as 公夫 or 工夫 instead of 功夫 (all homophones, but only one is the correct gung fu!). So if something seemed off, I would search CantoDict’s dictionary for similar sounds using Jyutping. From my friend Uncle Jasper (who has studied Chinese for years), I knew that Chinese words were generally two characters. CantoDict allows you to search with only a part of a sound, so, for instance, if I heard what sounded like “bong seoi” but I wasn’t sure of the full sounds, I could search for that using something like “bo s.” It would return a long list of character combinations that included those sounds, but searching this way is very labor intensive and not ideal. If I knew one of the characters it was more helpful, and even more if I knew the tone. The burned-in Mandarin subs were often helpful in this regard, leading me to spoken Cantonese equivalents that matched up with what I was hearing.
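The partial-sound search is easier to picture with a toy example. The Python sketch below is an assumption about how such a lookup could work, not CantoDict’s actual implementation: the tiny word list, the ENTRIES name, and both functions are invented for illustration. Each fragment of the query has to match the beginning of one syllable, in order, so “bo s” finds “bong6 seoi2,” and adding a tone number narrows things further.

```python
# Tiny hand-made word list (illustrative only):
# (characters, Jyutping with tone numbers, rough gloss)
ENTRIES = [
    ("磅水", "bong6 seoi2", "pay up (slang)"),
    ("功夫", "gung1 fu1", "kung fu"),
    ("工夫", "gung1 fu1", "work; time and effort"),
    ("茶渣", "caa4 zaa1", "used tea leaves"),
]


def matches(query: str, jyutping: str) -> bool:
    """True if each query fragment starts a consecutive syllable of the entry."""
    parts = query.lower().split()
    syllables = jyutping.lower().split()
    if not parts:
        return False
    # Slide the query across the entry's syllables.
    for offset in range(len(syllables) - len(parts) + 1):
        window = syllables[offset:offset + len(parts)]
        if all(syl.startswith(part) for syl, part in zip(window, parts)):
            return True
    return False


def search(query: str) -> list[tuple[str, str, str]]:
    """Return every entry whose Jyutping matches the partial query."""
    return [entry for entry in ENTRIES if matches(query, entry[1])]


for chars, jyutping, gloss in search("bo s"):
    print(chars, jyutping, gloss)
```

Searching “gung fu” in this toy list returns both 功夫 and 工夫, which is exactly the homophone problem: the sound alone cannot pick the right characters, so context has to.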
3. Alongside doing this, I would plug the initial Whispered text (and any revisions) into MDBG’s translator to see what that came up with. MDBG gives you a translation of whatever you entered, along with a character-by-character look at the text. It also has a filter on the translation for Cantonese. This was very helpful, but it is more focused on Mandarin and has a seemingly limited knowledge of Cantonese slang.
In June 2024, Google Translate added Cantonese to its supported languages, and I immediately added it to this “machine translator” step. It was very helpful because it showed the Jyutping below the characters, so I could easily follow along while listening to whatever line I was focusing on. Google also has a text-to-speech function which allowed me to easily hear whatever I had entered and compare.
The major drawback to Google’s Cantonese translation (and maybe all of their translations?) is that it is powered by AI and it hallucinates A LOT! Its first instinct is to attempt to figure out what you’re trying to say and deliver that, instead of just straight translating the characters you’ve provided it with. If you click on the English results, it shows the corresponding Chinese text it translated, and it’s almost always very different from whatever you initially entered. With that in mind, I mostly used Google Translate as a quick way to see a sentence as a string of Jyutping, and to hear the characters spoken in sequence. It seemed to have a limited knowledge of Cantonese slang, too.
4. While doing these steps, I also used the Google Translate app on my phone to OCR the on-screen Mandarin subs. I put those into MDBG on my phone and used it as a reference, because, as I discovered, the two dubs were generally saying similar things but just in their own dialect. So, for instance, this step might give me a word in Mandarin that means “take a look at something”, like 看看 hon hon, but in Cantonese they say 睇下 tai haa. In terms of simple translation, this is ultimately unnecessary for someone simply looking to create English subtitles. It became very important to me to do this, though, as I was trying to retain as much Cantonese flavor as possible. The Cantonese slang would often reveal interesting word constructions or a subtle difference in feel.
5. Again, while going through steps 2–4, I would also search for words and slang in the Pleco app. Pleco is a Chinese dictionary app that allows you to customize it with different, additional dictionaries. I added on all the free Cantonese dictionaries. Pleco is more stringent in its searching than CantoDict, so if you search for a pair of characters it would only return results where they are the first two characters. For instance, if I searched for caa zaa it would return 茶渣 (used tea leaves), but not 爛茶渣 laan caa zaa, the slang term for women over 30.
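The difference between the two search styles comes down to “starts with” versus “contains.” The short Python sketch below only illustrates that distinction under that assumption; it is not the code of either Pleco or CantoDict, and the word list and function names are made up for the example.

```python
# Illustrative word list; 茶葉 (tea leaves) is added only as a non-matching entry.
WORDS = ["茶渣", "爛茶渣", "茶葉"]


def prefix_search(term: str, words: list[str]) -> list[str]:
    """Stricter style: the entry must begin with the search term."""
    return [word for word in words if word.startswith(term)]


def contains_search(term: str, words: list[str]) -> list[str]:
    """Looser style: the term may appear anywhere in the entry."""
    return [word for word in words if term in word]


print(prefix_search("茶渣", WORDS))    # entries starting with 茶渣
print(contains_search("茶渣", WORDS))  # entries containing 茶渣 anywhere
```

The looser search is the one that surfaces 爛茶渣, which is why slang that tacks a character onto the front of a known word is easy to miss with a strict lookup.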
6. Alongside those methods, I also used my copy of A Dictionary of Cantonese Slang by Christopher Hutton & Kingsley Bolton. I referred to this a lot during the first half of the translation. I ultimately focused more on CantoDict & Pleco’s slang entries, as this helped speed things up and they had most of what the book had. I used the book more as a third step, as needed. I found a couple of unique things in the book, and it was an essential part of the learning process.
I also had some other books I used to help me better understand Cantonese towards the beginning. These were: A Glossary of Common Cantonese Colloquial Expressions by Simon Siu-hing So, and Cantonese: A Comprehensive Grammar 2nd Edition by Stephen Matthews and Virginia Yip.
7. As much as I tried, it was nearly impossible for me to listen and keep up with the spoken Cantonese in the film. Even with the Jyutping in front of me, I found it to be incredibly daunting at first. It was just so fast… if only I could slow it down. So at some point early on, I separated the film’s audio into its own MP3, and I loaded it into my trusty Cool Edit Pro 2.0. Cool Edit is an old audio editing program I’ve been using for 20+ years. I think the first version I used was on Windows 3.1! Cool Edit was eventually bought by Adobe, and it became Adobe Audition. But it’s not quite the same, especially now that you can no longer buy Adobe products by themselves. In any case, I always transfer my copy of Cool Edit Pro to any new computer I get because I’ve never found anything that works as intuitively as it does.
So with Cool Edit I was able to slow down whatever line I was listening to, and it allowed me to hear things that were previously unavailable to my ears! It helped me to hear the Cantonese more fully, and while I’m still a total Cantonese novice, my ear has become much more attuned than it was previously. I can now listen at regular speed and catch a lot more of the characters. Not that I can understand them, just that I can mostly hear the separation between them now.
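Cool Edit Pro was the tool used here, but for anyone without a copy, ffmpeg is a freely available stand-in that can do something similar. The Python sketch below only builds the command line rather than running it. The file names and timestamps are placeholders, the function name is invented, and the 0.5 to 2.0 limit on the tempo is a deliberately conservative assumption about ffmpeg’s atempo filter, which changes speed without shifting pitch.

```python
import shlex


def slow_clip_command(source: str, start: str, end: str,
                      tempo: float = 0.75,
                      output: str = "line_slow.wav") -> list[str]:
    """Build an ffmpeg command that cuts one line of dialogue and slows it down.

    tempo below 1.0 slows the audio; 0.75 means three-quarter speed.
    """
    if not 0.5 <= tempo <= 2.0:
        # Conservative range assumed for the atempo filter in this sketch.
        raise ValueError("tempo should be between 0.5 and 2.0")
    return [
        "ffmpeg",
        "-i", source,                      # full audio track
        "-ss", start,                      # where the line begins
        "-to", end,                        # where the line ends
        "-filter:a", f"atempo={tempo}",    # slow down, keep the pitch
        output,
    ]


# Placeholder file name and timestamps, for illustration only.
command = slow_clip_command("film_audio.mp3", "00:12:30", "00:12:36")
print(shlex.join(command))
# To actually run it (requires ffmpeg to be installed):
# import subprocess; subprocess.run(command, check=True)
```

Cutting out just the line in question keeps the slowed-down clip short, so it can be looped while comparing it against the Jyutping.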
8. Through some back-and-forth combination of these varied tools, I eventually arrived at a line in Cantonese that looked and sounded like what I was hearing. I looked at what Google & MDBG said it translated to, and taking into account any slang sayings found in the dictionaries, I created my own translation that fit within the film and the context of the scene. The six-week re-check really helped things like this come together more fully, as the first time through I had no idea where the movie was headed so I made some mistakes.
VI. Conclusion
That’s about it. I tried my best to lay out exactly how and why I set myself down this path, and hopefully my intentions come through to anyone who dares to read this whole thing.
I’d also like to say that the stacked subs are what I consider the main subs for two reasons. The first is that all Hong Kong films carry stacked subs theatrically, and they are how I first experienced subtitled Hong Kong films on bootleg VHS tapes. So I loved the idea of creating a stacked-sub experience to recreate the old style visually. With that in mind, the spoken Cantonese subs are not meant to be used as Chinese subs for Mandarin speakers. Probably don’t need to say that, but I just wanted to put it out there. The second is that for anyone looking to decipher what I’ve done, the Chinese text is already right there for you to examine.
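For the curious, a stacked subtitle is nothing exotic: it is one cue with two lines of text. The Python sketch below shows the idea using the common SubRip (.srt) layout. Both the file format and the Chinese-on-top ordering are assumptions made for the example, not a description of the actual subtitle file, and the function names and timings are invented.

```python
def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def stacked_cue(index: int, start: float, end: float,
                cantonese: str, english: str) -> str:
    """Build one SRT cue with the Cantonese line stacked above the English."""
    return (
        f"{index}\n"
        f"{srt_time(start)} --> {srt_time(end)}\n"
        f"{cantonese}\n"
        f"{english}\n"
    )


# Made-up timing; the text is the 磅水 example from earlier.
print(stacked_cue(1, 75.5, 77.25, "磅水", "Pay up!"))
```

Dropping the Cantonese line from each cue is all it takes to produce an English-only version from the same timings.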
I’ve also created an English-only version of the subs for anyone who would prefer that, and the file also includes the Mandarin dub if you’re inclined to compare/contrast (although the subs are not timed for this version and will be off). I also adjusted the audio to be more in sync with the action; it was very out of sync and distracting in the second half. I also cropped the video to remove the burned-in subs, allowing my stacked subs to look much better without other Chinese subs behind them.
Thanks for reading, and hopefully you enjoy the movie! If you have any questions about Whisper or anything, I will try to answer them.
If you read this long thing and still want more Reckless Cricket content, make sure to check out my review for the film, as well as my Translation Notes for my subtitles. You can also find the movie with my subs by visiting here, or watch it on YouTube!
(All the Reckless Cricket lobby cards are from the HKMDB)