#Roma #fediverso, are you there? Do we want to create it and support it? Romanisti, we are everywhere in the world and surely scattered across the fediverse. Let's create #romafediverso
@thelinuxEXP I really like Speech Note! It's a fantastic tool for quick and local voice transcription in multiple languages, created by @mkiol
It's incredibly handy for capturing thoughts on the go, conducting interviews, or making voice memos without worrying about language barriers. The app uses strictly locally running LLMs, and its ease of use makes it a standout choice for anyone needing offline transcription services.
I primarily use #WhisperAI for transcription and Piper for voice, but many other models are available as well.
It is available as a Flatpak and on GitHub: https://github.com/mkiol/dsnote
#TTS #transcription #TextToSpeech #translator translation #offline #machinetranslation #sailfishos #SpeechSynthesis #SpeechRecognition #speechtotext #nmt #linux-desktop #stt #asr #flatpak-applications #SpeechNote
I generally dislike puns in article titles but 'careless whisper', about the algorithmic harms of hallucinations in OpenAI's whisper tool, is pretty good https://doi.org/10.1145/3630106.3658996 #facct24
Shocking findings: 38% of hallucinations include explicit harms (violence, inaccurate associations, false assertions of authority) and they are more likely to occur in e.g. aphasic speech — so really this is a bias amplifier #ASR #LanguageTechnology
#Aegon products are going to fall under the #asr banner. So I thought I'd create an account there ahead of time. Here are the steps I just went through.
Email address confirmed.
Password set.
Log in.
Message: "Wrong login/password combination".
Password reset.
Log in.
Same message.
Password reset once more, this time with a different password.
Log in.
Same message, and now I'm locked out.
I installed @homeassistant with #VoicePreview over the weekend, using the Home Assistant Green.
It uses Whisper's base model under the hood, and I've got an Australian accent. My PhD research showed Whisper performs reasonably well on Australian accents, but I set up an automation for "start a writing session for {minutes} minutes".
And apparently I'm starting a _rotting_ session.
#ASR #ASRfail #HomeAssistant
Two posts from British Library Oral History Archivist Charlie Morgan on the challenges of AI for oral history: key questions and theoretical and practical issues for automatic speech recognition (ASR) tools and chatbots:
https://blogs.bl.uk/digital-scholarship/2024/12/the-challenges-of-ai-for-oral-history-key-questions.html
https://blogs.bl.uk/digital-scholarship/2024/12/the-challenges-of-ai-for-oral-history-theoretical-practical-issues.html
#ASR #OralHistory #ML #AI
@edsu do you know about the recently launched @AI4LAM Speech to Text Working Group: https://sites.google.com/view/ai4lam/working-groups-and-chapters? Everyone welcome #ASR #ML
"Coqui, a conversational AI startup, on Wednesday (January 3, 2023), announced that it is shutting down its operation ...
[It] specialises in building open source models and applications in the area of quick voice cloning, text-to-voice, etc. The former employees of Mozilla, left the company after it stopped developing their own Speech-to-text engine, DeepSpeech to begin Coqui.”
#KLKrithika, 2024
Grant Heinrich from the #NFSA talking about the first 140 days of using #Whisper #ASR for #transcription of archival material - that understands Australian English, including slang.
Begins with key principles of maintaining trust and forefronting creators.
CC-BY
Each quarter, when the new @mozilla #CommonVoice #dataset is released, I do a #dataviz using @observablehq of its #metadata coverage, across all 100+ languages, based on the JSON summary that is part of the release.
Some of my observations from the v18 release are:
#Catalan (ca) now has a larger dataset than English, based on the number of audio recordings (including validated and yet-to-be-validated recordings). It’s also an interesting dataset because the number of recordings per unique contributor is relatively low (around 80). This means it’s likely to have a high diversity of speakers in the dataset, which is useful for building #ASR models that generalise well to many speakers.
Catalan also appears to have the highest percentage of audio recordings by older speakers - e.g. speakers in their forties, fifties and older. Again, this highlights the diversity of speakers in the Catalan dataset.
Although it’s very early to see any trends from the decision by Common Voice to expand the range of options for gender identity, we are starting to see some data being tagged with the new options that are available. For example, in #Uyghur (ug), we now have data tagged as “do not wish to say”. I don’t want to draw connections between the geopolitical situation in that area and the desire of data contributors not to provide demographic data which may in some way identify them without more evidence, but I think it’s telling that the first use of these expanded metadata categories appears in a language that is spoken in a contested geography.
Similarly, it’s very early to identify trends in sentence domain classification - as most of the sentences that do have a domain tag are labelled “general”, although “health_care” sentences are occurring frequently in languages such as #Albanian (sq).
#Bangla (Bengali) (bn) continues to have a very large number of yet-to-be-validated audio recordings. Due to this, the train split for Bangla is quite small.
#Dholuo (luo), a language spoken in Kenya and Tanzania, is an outlier in terms of the number of distinct data contributors to the dataset - this language has a very high average number of contributions per contributor. This is often seen in languages that are new to Common Voice, before they have been able to recruit more contributors. Dholuo has nearly 5 million speakers.
The language with the highest average utterance duration is by far #Icelandic (is) at over 7 seconds. This may be because Icelandic has many words with several syllables, which take longer to pronounce. Consider "the cat sat on the mat" in English, cf "kötturinn sat á mottunni" in Icelandic.
Big thanks to all data contributors in this release for your donated utterances, and to Dmitrij Feller, @jessie, Gina Moape, EM Lewis-Jong and the team for all your efforts.
What are your thoughts? What conclusions do you draw?
https://observablehq.com/@kathyreid/mozilla-common-voice-v18-dataset-metadata-coverage
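The "recordings per unique contributor" figure discussed in the Common Voice post above can be sketched in a few lines. This is only an illustration, not the notebook's actual code: the field names `clips` and `users` are assumptions about the release's JSON summary, and the numbers in `sample_locales` are invented for the example.

```python
def clips_per_contributor(locales):
    """Return average recordings per contributor for each locale.

    `locales` maps a locale code to a dict of summary stats.
    The keys "clips" and "users" are assumed field names.
    """
    ratios = {}
    for code, stats in locales.items():
        users = stats.get("users", 0)
        if users:  # skip locales with no contributors to avoid dividing by zero
            ratios[code] = stats.get("clips", 0) / users
    return ratios


# Invented numbers, for illustration only (not real Common Voice figures).
sample_locales = {
    "ca": {"clips": 800, "users": 10},
    "luo": {"clips": 5000, "users": 5},
    "xx": {"clips": 0, "users": 0},
}

ratios = clips_per_contributor(sample_locales)
# Locale with the most recordings per contributor, i.e. the least speaker diversity.
most_concentrated = max(ratios, key=ratios.get)
```

A low ratio suggests many distinct speakers, while a high one suggests a handful of very active contributors.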
Like many other technologists, I gave my time and expertise for free to #StackOverflow because the content was licensed CC-BY-SA - meaning that it was a public good. It brought me joy to help people figure out why their #ASR code wasn't working, or assist with a #CUDA bug.
Now that a deal has been struck with #OpenAI to scrape all the questions and answers in Stack Overflow, to train #GenerativeAI models, like #LLMs, without attribution to authors (as required under the CC-BY-SA license under which Stack Overflow content is licensed), to be sold back to us (the SA clause requires derivative works to be shared under the same license), I have issued a Data Deletion request to Stack Overflow to disassociate my username from my Stack Overflow content, and am closing my account, just like I did with Reddit, Inc.
https://policies.stackoverflow.co/data-request/
The data I helped create is going to be bundled in an #LLM and sold back to me.
In a single move, Stack Overflow has alienated its community - which is also its main source of competitive advantage, in exchange for token lucre.
Stack Exchange, Stack Overflow's former instantiation, used to fulfill a psychological contract - help others out when you can, for the expectation that others may in turn assist you in the future. Now it's not an exchange, it's #enshittification.
Programmers now join artists and copywriters, whose works have been snaffled up to create #GenAI solutions.
The silver lining I see is that once OpenAI creates LLMs that generate code - like Microsoft has done with Copilot on GitHub - where will they go to get help with the bugs that the generative AI models introduce, particularly given the recent GitClear report on the "downward pressure on code quality" caused by these tools?
While this is just one more example of #enshittification, it's also a salient lesson for #DevRel folks - if your community is your source of advantage, don't upset them.
Folks, I'm starting my post-#PhD job search low-key on the side while I write up my #thesis.
I have an odd collection of skills - #Linux, #Python, #Jupyter, #pandas, #DevRel, and I've done a lot of work in team leadership and management, and have led a multi-million $ not for profit in the past. Keynote speaker.
My speciality is #voice and #speech AI, more on the #ASR side with models like #Whisper.
I'm looking for something that harnesses all of these skills - and it will be a senior role with senior pay, given my experience, qualifications and proven capability. I have time and will be discerning about my next step.
Job titles that might fit here would be Senior Research Engineer, Engineering Lead, Lead AI Engineer or similar.
Looking for fully remote work, with one day a fortnight max in #Melbourne, AU. If you don't believe in #RemoteWork or #WFH, we're not a good fit.
Super keen on something full time rather than splitting my attention over multiple part-time roles.
Looking to start around August, so a fair amount of lead time.
Keen on organisations that have strong values alignment - #FAIR and #CARE data use, #EthicalAI, AI for social good.
No crypto, no web3, no deepfake stuff.
Check out my LinkedIn for more info on my background:
https://www.linkedin.com/in/kathyreid/
A warm welcome to #Mastodon to @thorstenvoice - one of the best communicators about #ASR #TTS and #STT in the world. His #OpenSource #German #Deutsche dataset is in use in many places.
Please make Thorsten welcome
For folks who work in #DataScience, what's the easiest way for me to calculate the #CosineSimilarity of two strings? I'm looking at sklearn cosine_similarity first.
Related to hallucination detection in #ASR - low cosine similarity indicative of hallucination.
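One way to approach the question above, as a minimal sketch: sklearn's `cosine_similarity` operates on numeric vectors, so strings have to be vectorised first. The version below uses only the standard library and a plain bag-of-words count, which is an assumption about what "similarity of two strings" should mean here; the helper names `tokenize` and `cosine_similarity_text` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity_text(a, b):
    """Cosine similarity of two strings using bag-of-words counts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    # Dot product over the tokens both strings share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty string has no direction to compare
    return dot / (norm_a * norm_b)


reference = "start a writing session"
hypothesis = "start a rotting session"
score = cosine_similarity_text(reference, hypothesis)
```

Word counts ignore meaning and word order, so a low score is only a rough signal that a transcript has drifted from a reference. Sentence embeddings may separate hallucinated passages better, at the cost of an extra model.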
#SeabrookNuclearPlant faces ongoing challenge of managing concrete degradation
By Angeljean Chiaramida, Seacoastonline
July 13, 2023
"Agency officials also discussed the problem that’s dogged the power plant’s concrete for more than a decade: alkali-silica reaction. NextEra, they noted, will have to bring resources to bear on a continual basis to address #ASR as Seabrook Station ages to comply with the conditions of its operating license. A 2023 report shows concrete degradation has expanded from seven to 10 structures at the Seabrook plant."
#SeabrookNuclearPlant #SeabrookStation #C10 #SeaLevelRise #NoNukes #RethinkNotRestart #ClimateCrisis #Flooding #ClimateCatastrophe #WaterIsLife #AlkaliSilicaReaction
#GlobalSeaLevelRise
So, it seems the #SeabrookNuclearPlant survived the recent storms without incident, but if there was a problem, there is NO WAY nearby residents would have been able to evacuate. I came across this letter to the #NRC from the group #NoMoreFukushimas expressing their concerns about #ClimateChange and #NuclearPlants in 2012!
Concerns regarding the #SeabrookStation
No More Fukushimas letter to the NRC.
The Honorable Allison M. Macfarlane, Chair
Nuclear Regulatory Commission
11555 Rockville Pike
Rockville, MD 20852
November 8, 2012
Dear Chairwoman Macfarlane:
We appreciated receiving a Nuclear Regulatory Commission (NRC) response to the August 28, 2012, letter that we sent to the NRC concerning Seabrook Station relicensing. The NRC's response (October 17, 2012) came from Dennis Morey, Chief, Project Manager 1, Projects Branch Division of License Renewal, Office of Nuclear Reactor Regulation (Docket No. 50-443).
In our letter, we highlighted a concern openly discussed at the NRC meeting on Seabrook relicensing held on April 26, 2012, in Hampton, New Hampshire. Data indicates that due to climate change there could be an increase in #SeaLevels and storm surges that would affect the Seabrook plant. Obviously, the flooding of the Seabrook plant campus should be a cause for concern, especially since the flooding is projected to occur within the timeframe of the relicensing period, 2030-2050.
In his response to our letter, Mr. Morey categorically rejected the idea that this rising sea level information was of any relevance to the relicensing of the Seabrook plant:
"Regarding your concerns about the current design-basis flood level calculations.... please note that these issues are not part of the NRC's review of a license renewal application. A license renewal review is not a re-review of the facility licensing basis; rather, it is focused on managing the age-related degradation of passive systems, structures, and components to ensure they will fulfill their safety-related functions, as specified in the current licensing basis.
"The NRC has multiple processes to evaluate the adequacy of current plant operations and licensing bases. Should the NRC become aware at any time of information calling into question the continued safe operation of any nuclear power plant, including Seabrook Station, the NRC will take the appropriate actions as part of the agency's ongoing safety oversight, regardless of whether those plants have sought or are seeking a renewed license."
In the twists and turns of bureaucratic thinking, Mr. Morey may be technically correct that climate-change-related flooding is not an "age-related" deterioration artifact. But Mr. Morey seems to brush off the fact that new global climate conditions could completely reconfigure the safety profile of the plant. We believe that whether or not climate-change-related flooding falls within "design-basis flood calculations" is a hairsplitting issue for bureaucrats. However, for those of us who live near the plant it's a major safety issue. Therefore, if necessary, we respectfully recommend that the NRC modify its relicensing concerns to include global climate change/rising sea levels in its license renewal framework.
Furthermore, Mr. Morey must know that the NRC has identified "alkali-silica reaction (ASR)" as a potential long-term threat to the reliability of the Seabrook plant and that structural degradation due to #ASR is currently under the NRC's relicensing review. The flooding water will obviously raise levels of saltwater saturation, which will accelerate concrete degradation so, on that basis alone, the flooding should be within the Seabrook relicensing purview.
Finally, since Mr. Morey did not identify the steps the NRC plans to take to address flooding at the Seabrook plant, we surmise that the NRC does not consider flooding due to sea-level rise to be a problem. Our concern has escalated since researchers at the Shorenstein Asia-Pacific Research Center at Stanford University in an October 31, 2012, piece in the Washington Post reported that they had conducted a study that assessed the vulnerability of #NuclearPlants to flooding around the world.
The Stanford researchers collected information on plant height, #SeaWall height and the location of emergency power generators for 89 nuclear plants that lie next to water. They compared this to historical information on high waves triggered by various sources, such as #earthquakes, #landslides and #hurricanes. The study found that the U.S. plants most vulnerable to inundation are the Salem and #HopeCreek plants on the New Jersey / #Delaware border; the #Millstone plant in Connecticut; and the Seabrook plant in New Hampshire (italics added). We strongly urge you to contact the researchers and obtain this invaluable information from them directly.
That said, we ask the NRC, as we did in our August letter, to review the risk that rising sea levels, #StormSurges or increased groundwater saturation of concrete poses to residents who live in the vicinity of the Seabrook nuclear power plant. As we have stated, we believe it is entirely appropriate to do so within the purview of the license renewal process. But, in the spirit of public safety, which we believe should be paramount, we urge the NRC to use whatever regulatory tools are needed to investigate this critical issue.
Sincerely yours,
Bruce Skud and Joanna Hammond
Co-founders, No More Fukushimas!
Here's a #BookReview I wrote of Tobias Dengel and Karl Weber's "The Sound of the Future" - which claims that #voice #technology like #ASR, #TTS and #synthetic #speech are transformative, and that businesses should start to invest heavily in them.
While the book covers a lot of ground, it leaves many more critical questions unanswered in its unabashed techno-optimism.
Last week, as part of my #PhD program at the #ANU School of #cybernetics, I gave my final presentation, which is a summary of my methods and #research findings. I covered my interview work, the #dataset documentation analysis work I've been doing and my analysis work around #accents in @mozilla's #CommonVoice platform.
There were some insightful and thought-provoking questions from my panel and audience members, and of course - so many ideas for future research inquiry!
A huge thanks to my panel, chaired so well by Professor Alexandra Zafiroglu, to Dr Elizabeth Williams, my meticulous, methodical and always-encouraging Primary Supervisor, and to my co-supervisors Dr Jofish Kaye and Dr Paul Wong 黃仲熙 for their deep expertise in #HCI and #data respectively.
Similarly, a huge thank you to my #PhD cohort - Charlotte Bradley, Tom Chan, Danny Bettay and Sam Backwell - as well as the other cohorts in the School - for your encouragement and intellectual journeying.
Does anyone here know how to use #kaldi or other speech recognition stuff?
I tried whisper.cpp but it apparently can only use OpenAI's models, so it's not an option, on ethical grounds.
I want to implement cross-platform voice commands into #Freespace Open, as it currently only works in Windows with the Microsoft SAPI.
(boosts welcome)
For folks who work with #ASR #SpeechRecognition, specifically #Whisper from #OpenAI - I have heard some anecdotal evidence of transcription with the medium-en model returning paragraphs of "junk" content, like weather reports and adverts for golfing supplies.
I have three confirmed reports from transcripts of interviews of unrelated topics, and am curious if there are other (as yet unreported) instances of similar?
If so, please let me know - DM for email address.
Boosts appreciated.