Posts

DBPedia Natural Language Interface Using Huggingface Transformer

I prototyped a simple natural language question answering demo in about 90 minutes. The demo accepts a query like “where does Bill Gates work?”, finds the likely URI for Bill Gates, collects some comment text for this DBPedia entity with a SPARQL query, and then passes the original query to the transformer model with that comment text as the “context”. I run this on Google Colab. Note that I saved my Jupyter Notebook as a Python file, which is in the listing below. Note the use of ! to run shell commands (e.g., !pip install transformers).

    # -*- coding: utf-8 -*-
    """DbPedia QA system.ipynb

    Automatically generated by Colaboratory.

    Original file is located at
        https://colab.research.google.com/drive/1FX-0eizj2vayXsqfSB2ONuJYG8BaYpGO

    **DBPedia Question Answering System**

    Copyright 2021 Mark Watson. All rights reserved.

    License: Apache 2
    """

    !pip install transformers
    !pip install SPARQLWrapper

    from transformers import pipeline

    qa = pipeli
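The flow described above (entity URI, comment text from a SPARQL query, then the question plus that text as context for the transformer) might look roughly like the sketch below. It is not the notebook's code: the function names, the hard-coded Bill Gates URI, the choice of rdfs:comment, and the public endpoint URL are assumptions made for illustration, and the step that finds the likely URI is skipped entirely.

```python
def build_comment_query(entity_uri):
    """Build a SPARQL query for the English rdfs:comment of one entity."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        f"SELECT ?comment WHERE {{ <{entity_uri}> rdfs:comment ?comment . "
        'FILTER (lang(?comment) = "en") } LIMIT 1'
    )


def fetch_comment(entity_uri, endpoint="https://dbpedia.org/sparql"):
    """Run the query against a SPARQL endpoint and return the comment text."""
    # Imported lazily so the pure helpers work without SPARQLWrapper installed.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(build_comment_query(entity_uri))
    sparql.setReturnFormat(JSON)
    bindings = sparql.query().convert()["results"]["bindings"]
    return bindings[0]["comment"]["value"] if bindings else ""


def answer(question, context, qa):
    """Ask the QA model a question about the context; None if no context."""
    if not context:
        return None
    return qa(question=question, context=context)["answer"]


def demo():
    """End-to-end run; needs network access and downloads a default model."""
    from transformers import pipeline

    qa = pipeline("question-answering")
    uri = "http://dbpedia.org/resource/Bill_Gates"  # assumed, not looked up
    print(answer("where does Bill Gates work?", fetch_comment(uri), qa))
```

Calling demo() runs the whole chain. Keeping answer() independent of how the context was fetched makes it easy to try other text sources, such as a longer abstract, without touching the model code.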

I have a new job helping to build a Knowledge Graph at Olive AI

I retired a year ago April (my last job was Master Software Engineer and manager of a deep learning team at Capital One) and was enjoying time with friends and family, doing personal research in hybrid AI, writing a lot, and volunteering at our local food bank. I stopped my volunteer work because of COVID-19 and welcomed the opportunity last month to start work at Olive AI on a very strong Knowledge Graph team. I believe in their mission, and the work and the people are great! It is refreshing to leave the deep learning field, at least for a while. My heart is in developing stronger AI that can explain its actions and adapt flexibly to help people in their lives. I always take a humans-first stand on technology. AI systems should help us get our work done efficiently, remove tedium, allow us more time for creative activities, and generally let us enjoy our own humanity.

I have tried to take advantage of extra time during the COVID-19 pandemic

My wife Carol and I have been practicing social distancing and wearing masks for shopping for over 5 months now. Welcome to the new normal and a crazy world in which entertaining and seeing friends is done by meeting in people's yards, with everyone bringing their own "meal in a bag." I enjoy writing, so I have been updating my recent books, starting with Loving Common Lisp, or the Savvy Programmer's Secret Weapon and A Lisp Programmer Living in Python-Land: The Hy Programming Language. These are free to read online and licensed with Creative Commons Share and Share Alike, No Commercial Reuse, so you can also find copies on the web (hopefully up-to-date copies!). Last month I started a much larger project: I have not updated my book Practical Artificial Intelligence Programming With Java since the fourth edition was published in 2013. I have discarded a lot of older material like expert systems, and have three new chapters on the semantic web and also a new chapter on

Custom built SBCL and using spaCy and TensorFlow in Common Lisp

Here are some of my recent notes that might save you some time, or teach you a new trick. I have had good results using the py4cl library if I wrap API calls to TensorFlow or spaCy in a short Python wrapper library that returns results in simple types like strings and dictionaries. I just committed a complete example (Python library and Common Lisp client code) to the public repo for my book Loving Common Lisp, or the Savvy Programmer's Secret Weapon; the example will be added to the next edition of the book. Here is a link to the subdirectory with this new example in my repo: https://github.com/mark-watson/loving-common-lisp/tree/master/src/spacy

I frequently make standalone executable programs using SBCL, and I just noticed a great tip from Zach Beane for compressing the size of standalone executables. Start by rebuilding SBCL from source to add the compression option; get the source code and:

    ./make.sh --with-sb-thread --with-sb-core-compression
    sh in
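To illustrate the wrapping idea on the Python side, here is a minimal sketch of a wrapper that hands back only lists, strings, and dictionaries, which are easy to convert on the Common Lisp side. It is not the code from the repo linked above: the function name parse_to_simple_types, the optional nlp parameter, and the en_core_web_sm model name are assumptions for illustration.

```python
def parse_to_simple_types(text, nlp=None):
    """Run an NLP pipeline and return only plain Python types.

    Hypothetical wrapper: rich objects such as spaCy Doc and Token stay on
    the Python side, and the caller receives only strings, lists, and a
    dictionary.
    """
    if nlp is None:
        # Lazy import so the wrapper can be loaded without spaCy installed.
        import spacy

        nlp = spacy.load("en_core_web_sm")  # assumed model name
    doc = nlp(text)
    return {
        "tokens": [token.text for token in doc],
        "entities": [[ent.text, ent.label_] for ent in doc.ents],
    }
```

In a long-running session it would make sense to load the model once and pass it in through nlp, since loading a spaCy model on every call is slow.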

Protecting oneself from surveillance capitalism

As an author I find occasional use of Facebook and Twitter to be useful for “broadcasting” notifications of my new books, open source projects, etc. I also find Gmail to be useful for some types of email. Still, I do like to take a few easy steps to push back a little against the free use of my web behavioral data to profit corporations that I don’t do business with (and those I do):

- Use ProtonMail as my primary email.
- Use Firefox on my Linux and macOS laptops with individual containers for Google, Facebook, etc.
- On iOS devices, favor browsing with private tabs.
- Use a VPN when I am traveling and when I need to use public WiFi.
- Limit use of my Gmail address to a backup email and as a junk email address.
- For online purchases from Amazon, etc., use a secure email service that does not use the contents of your email to market to you and as data to sell to 3rd parties.
- Frequently delete all cookies from web browsers that you use.
- Use private browsing windows for routine use of the

My hopes and predictions for the next 10 years

Hello everyone, I wish everyone a happy and healthy new year! Here are my predictions for the next ten years:

- Wearable devices like the Apple Watch will become widely used, and because of user pushback we will see companies like Apple, Google, Toshiba, Huawei, Samsung, etc. start to support standards that allow, for example, an Apple Watch to interact with a Samsung TV. Further, I expect a single personal device (watch or phone) to be, for most users, a connection hub that interacts with public kiosks, displays, input devices, etc.
- Deep learning architectures and techniques will rapidly improve and will continue to rule the world, at least for a while. I expect at least one new dramatic paradigm shift for AI beyond current deep learning, reinforcement learning, etc. models.
- The world economies will get hit hard and wealth will be in a larger part measured in terms of ownership of water and food production, manufacturing, technology IP, and hopefully hard assets like gold, silver,

GANs and other deep learning models for cooking recipes

I retired this spring after working on artificial intelligence projects since the 1980s. Freedom from having to work on large projects for other people and companies is liberating and frees up time for thinking about new ideas. Currently I am most interested in deep learning models for generating and evaluating recipes; for now I am using a GAN model (which I am calling RecipeGAN). When I managed a deep learning team at Capital One, I used GANs to synthesize data. During a Saturday morning quiet-time hacking sprint the first month at that job, I had the idea to take an example program, SimpleGAN, that generated MNIST digits and instead generate numeric spreadsheet data (using the Wisconsin Cancer Data Set that I had previously used in my books as example machine learning data). I was really surprised how well this worked: I could generate fake Wisconsin cancer data, train a classification model on the fake data, and get classification prediction accuracy on real data samples that wa
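The "train on fake data, score on real data" check described above can be sketched in a few lines. This is only an illustration of the evaluation loop, under loud assumptions: a per-class Gaussian sampler stands in for the GAN generator, a nearest-centroid rule stands in for the classification model, and the eight REAL rows are made-up toy numbers, not the Wisconsin data.

```python
import random
import statistics


def group_by_label(rows):
    """Map each class label to the list of feature vectors carrying it."""
    groups = {}
    for features, label in rows:
        groups.setdefault(label, []).append(features)
    return groups


def synthesize(rows, n_per_class, rng):
    """Stand-in for a GAN generator: sample per-class Gaussian features."""
    fake = []
    for label, feats in group_by_label(rows).items():
        stats = [(statistics.mean(c), statistics.stdev(c)) for c in zip(*feats)]
        for _ in range(n_per_class):
            fake.append(([rng.gauss(m, s) for m, s in stats], label))
    return fake


def fit_centroids(rows):
    """Stand-in classifier: one mean vector (centroid) per class."""
    return {
        label: [statistics.mean(c) for c in zip(*feats)]
        for label, feats in group_by_label(rows).items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    def dist2(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=dist2)


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(centroids, f) == y for f, y in rows) / len(rows)


# Made-up toy "real" rows: (features, label). Not the Wisconsin data.
REAL = [
    ([1.0, 2.0], 0), ([1.5, 1.8], 0), ([0.8, 2.2], 0), ([1.2, 2.1], 0),
    ([8.0, 9.0], 1), ([8.5, 9.2], 1), ([7.9, 8.8], 1), ([8.2, 9.1], 1),
]

rng = random.Random(0)
fake = synthesize(REAL, n_per_class=200, rng=rng)
model = fit_centroids(fake)      # train only on synthetic rows
score = accuracy(model, REAL)    # evaluate only on real rows
print(f"accuracy on real rows after training on fake rows: {score:.2f}")
```

The point of the loop is that the model never sees a real row during training, so a high score on real rows suggests the synthetic data preserved the structure that matters for classification.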