Sample of Voice Authentication

Have you ever thought of using your voice, instead of a plain-text string, to authenticate?

It’s not science fiction. It’s called biometric authentication, and some companies, such as Google, plan to move to it in the coming years.

In this article, I’m going to illustrate it using an open-source library and a small example.

The idea was inspired by a TED talk.

Starting from the concept that “a voice is as unique as a fingerprint”, I tried to build an authentication layer where, instead of typing a plain-text password, the user is recognized by their voice.

This is Biometric Authentication, which is a very large topic; voice is only one parameter of a larger set.

Google is working on this technology, so we will have to wait and see the results (https://www.theguardian.com/technology/2016/may/24/google-passwords-android).

I adopted the “Java Speaker Recognition Framework” (the Recognito library) to recognize a voice from a source such as a microphone.

Unfortunately, I must start with some bad news: the library is almost out of date and no future updates are scheduled.

However, I think this example can still be valuable for future implementations.

I’ll try to explain briefly, and in very simple terms (I’m not a sound engineer), how it works.

A sound wave looks like this:

[Image: concy]

This is the waveform of a 128 kbps voice recording that lasts 3 seconds.

The following is the same word pronounced by the same person:

[Image: concy_2]

And now, this is the same word again, this time pronounced by another person:

[Image: matta2]

Can you see the difference between the three waves? Even at first glance, the first and the second waves look much more similar to each other than to the last one.

The idea is to take samples from these waves and then compare them with samples from the original recording made by the authenticated user.
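As a rough mental model of that comparison, here is a minimal sketch that assumes each recording has already been reduced to a fixed-length vector of numbers. The class name, the method and the feature values are all hypothetical and chosen only for illustration; this is not a description of what the library does internally.

```java
/** Toy illustration: compare recordings reduced to fixed-length feature vectors. */
final class VoiceDistanceSketch {

    /** Euclidean distance between two feature vectors of equal length. */
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Feature vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Hypothetical feature values, invented for illustration only.
        double[] enrolled = {0.20, 0.50, 0.10};
        double[] sameSpeaker = {0.22, 0.48, 0.11};
        double[] otherSpeaker = {0.60, 0.10, 0.40};

        System.out.println("same speaker:  " + distance(enrolled, sameSpeaker));
        System.out.println("other speaker: " + distance(enrolled, otherSpeaker));
    }
}
```

With these made-up numbers, the recording from the same speaker ends up much closer to the enrolled one than the recording from the other speaker, which is the property the authentication relies on.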

Let’s look at the application code.


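import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;

import javax.servlet.http.HttpServletRequest;
import javax.sound.sampled.UnsupportedAudioFileException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.multipart.MultipartFile;
import org.springframework.web.servlet.ModelAndView;

import com.bitsinharmony.recognito.MatchResult;

@Controller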
public class RecognitoController {

	private Voice voice;

	private static final Log logger = LogFactory
			.getLog(RecognitoController.class);

	@RequestMapping(method = RequestMethod.POST, value = "/uploadnewuser")
	public ModelAndView uploadNewUserWave(
			@RequestParam("file") MultipartFile file,
			@RequestParam("username") String username,
			HttpServletRequest request) throws IOException,
			UnsupportedAudioFileException {
		try {
			byte[] valueDecoded = file.getBytes();

			String filePath = request.getServletContext().getRealPath(
					"/audio/" + username + ".wav");

			logger.info("Upload User:" + username + " File:" + filePath);

			File sample = new File(filePath);

			// try-with-resources closes the stream even if the write fails.
			try (FileOutputStream os = new FileOutputStream(sample)) {
				os.write(valueDecoded);
			}

			return new ModelAndView("viewJson", "processed", username + " - OK");
		} catch (Exception e) {
			logger.error("Error", e);
			return new ModelAndView("viewJson", "processed", "Error!"
					+ e.getMessage());
		}

	}

	@RequestMapping(method = RequestMethod.POST, value = "/upload")
	public ModelAndView uploadWave(@RequestParam("file") MultipartFile file,
			@RequestParam("username") String username,
			HttpServletRequest request) throws IOException,
			UnsupportedAudioFileException {

		byte[] valueDecoded = file.getBytes();

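		// Build a time-of-day suffix for the request file name.
		SimpleDateFormat dateFormat = new SimpleDateFormat("HH:mm:ss");
		Date date = new Date();
		String today = dateFormat.format(date); // e.g. 16:16:39
		// ':' is not allowed in file names on some systems.
		today = today.replace(":", "_");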

		String filePath = request.getServletContext().getRealPath(
				"/audio/req_" + today + ".wav");

		File sample = new File(filePath);

		// try-with-resources closes the stream even if the write fails.
		try (FileOutputStream os = new FileOutputStream(sample)) {
			os.write(valueDecoded);
		}

		String filePathMatch = request.getServletContext().getRealPath(
				"/audio/" + username);

		Voice voiceMatch = new Voice(username, filePathMatch);

		List<MatchResult<String>> matches = voiceMatch.recognito
				.identify(sample);

		logger.info("");
		logger.info("****************");
		logger.info("Verify User:" + username + " File:" + filePath);
		logger.info("");

		logger.info("Sample " + filePath);

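		StringBuilder sb = new StringBuilder();

		// Only one voice print was loaded, so take the first match.
		MatchResult<String> result = matches.get(0);

		// Threshold: accept only when the distance is below 0.1.
		if (result.getDistance() < 0.1) {
			sb.append("OK");
		} else {
			sb.append("KO");
		}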

		logger.info("Identified: " + result.getKey() + " distance of "
				+ result.getDistance() + " with " + result.getLikelihoodRatio()
				+ "% positive about it...");

		return new ModelAndView("viewJson", "processed", sb.toString());

	}

	@RequestMapping(method = RequestMethod.GET, value = "/getuserlist")
	public ModelAndView getuserlist(HttpServletRequest request)
			throws IOException {
		return new ModelAndView("viewJson", "processed", voice.getOptionUser());
	}

	public Voice getVoice() {
		return voice;
	}

	public void setVoice(Voice voice) {
		this.voice = voice;
	}
}

The uploadNewUserWave method receives the WAV file from the user interface and saves it to disk. It is used to enroll new users.
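Because the stored file name is built directly from the submitted username, it may be worth validating that value before using it in a path. The following is only a sketch under an assumption that is not part of the original project: user names are limited to 1 to 32 letters, digits, underscores or hyphens. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

/** Sketch: validate a user name before turning it into a file name. */
final class UsernameSketch {

    // Assumption: 1-32 letters, digits, underscores or hyphens.
    private static final Pattern SAFE_NAME = Pattern.compile("[A-Za-z0-9_-]{1,32}");

    /** Returns true only for names that cannot contain path separators or dots. */
    static boolean isSafeUsername(String username) {
        return username != null && SAFE_NAME.matcher(username).matches();
    }

    /** Builds the enrollment file name, rejecting unsafe user names. */
    static String sampleFileName(String username) {
        if (!isSafeUsername(username)) {
            throw new IllegalArgumentException("Invalid username");
        }
        return username + ".wav";
    }
}
```

A call such as UsernameSketch.sampleFileName("marco") yields "marco.wav", while a value like "../secret" is rejected before it reaches the file system.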

The uploadWave method compares the stored sample with the one uploaded by the user when they attempt to authenticate.

In the check result.getDistance() < 0.1, I set the threshold that decides whether the new track is “similar” enough to the stored one.

Obviously, the lower the threshold, the more closely the uploaded voice must match the stored one for the authentication to succeed.
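To see how the threshold trades convenience against strictness, here is a small sketch that counts how many attempts would be accepted at different thresholds. The distances are hypothetical values invented for the example, not measurements from the application, and the class and method names are made up.

```java
/** Sketch: effect of the distance threshold on accepted attempts. */
final class ThresholdSketch {

    /** Mirrors the controller check: accept when distance is strictly below the threshold. */
    static boolean accept(double distance, double threshold) {
        return distance < threshold;
    }

    /** Counts how many of the given distances would be accepted. */
    static int countAccepted(double[] distances, double threshold) {
        int accepted = 0;
        for (double d : distances) {
            if (accept(d, threshold)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Hypothetical distances, for illustration only.
        double[] genuine = {0.04, 0.08, 0.12};  // same speaker as the stored sample
        double[] impostor = {0.09, 0.25, 0.40}; // different speakers

        for (double threshold : new double[] {0.05, 0.1, 0.2}) {
            System.out.println("threshold " + threshold
                    + ": genuine accepted " + countAccepted(genuine, threshold)
                    + ", impostors accepted " + countAccepted(impostor, threshold));
        }
    }
}
```

With these made-up numbers, a threshold of 0.05 lets no impostor in but rejects two of the three genuine attempts, while 0.2 accepts every genuine attempt and also one impostor. A real value would have to be tuned on actual recordings.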

The Voice class is used to create and store the voice prints to compare against. This is the code:


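import java.io.File;
import java.io.IOException;
import java.util.Map;

import javax.sound.sampled.UnsupportedAudioFileException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import com.bitsinharmony.recognito.Recognito;

public class Voice {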

	public Recognito<String> recognito = new Recognito<>(8000.0f);

	private static final Log logger = LogFactory.getLog(Voice.class);	

	protected String pathVoiceRecorded;

	public Voice() {
	}

	public Voice(String name, String path)
	{
		try {
			recognito.createVoicePrint(name, new File(path));

			logger.info(name + " - " + path);
		} catch (UnsupportedAudioFileException e) {
			logger.error("Error", e);
		} catch (IOException e) {
			logger.error("Error", e);
		}
	}
	
	public Voice(Map<String, String> voicePrint) {
		try {
			for (Map.Entry<String, String> entry : voicePrint.entrySet()) {

				recognito.createVoicePrint(entry.getKey(),
						new File(entry.getValue()));

				logger.info(entry.getKey() + " - " + entry.getValue());
			}
		} catch (UnsupportedAudioFileException e) {
			logger.error("Error", e);
		} catch (IOException e) {
			logger.error("Error", e);
		}
	}

	public Voice(String pathVoiceRecorded) {

		this.pathVoiceRecorded = pathVoiceRecorded;

		try {
			// listFiles() returns null when the path is not a readable directory.
			File[] files = new File(pathVoiceRecorded).listFiles();
			if (files == null) {
				logger.error("Not a readable directory: " + pathVoiceRecorded);
				return;
			}

			for (File file : files) {
				if (file.isFile()) {
					// Use the File returned by listFiles() directly, so the
					// configured path does not need a trailing separator.
					recognito.createVoicePrint(file.getName(), file);
					logger.info(file.getName() + " - " + file.getPath());
				}
			}
		} catch (UnsupportedAudioFileException e) {
			logger.error("Error", e);
		} catch (IOException e) {
			logger.error("Error", e);
		}
	}

	public String getOptionUser() {

		// listFiles() returns null when the path is not a readable directory.
		File[] files = new File(pathVoiceRecorded).listFiles();
		if (files == null) {
			return "";
		}

		StringBuilder sb = new StringBuilder();

		for (File file : files) {
			// Skip the temporary "req_" samples saved by verification requests.
			if (file.isFile() && !file.getName().startsWith("req_")) {
				sb.append("<option value=\"" + file.getName() + "\">"
						+ file.getName() + "</option>");
				logger.info(file.getName());
			}
		}
		return sb.toString();
	}
}

The constructor of the Recognito object sets the sample rate to 8 kHz (8,000 Hz).

A higher sample rate means larger stored files and, in principle, a more accurate comparison between the tracks. I didn’t find any advantage in increasing this value.

The configuration is that of a simple Spring REST application:
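The effect on file size is easy to estimate. The sketch below assumes uncompressed 16-bit mono PCM audio and ignores the small WAV header; both the assumption and the helper name are illustrative and not taken from the project.

```java
/** Sketch: approximate size of uncompressed PCM audio data. */
final class PcmSizeSketch {

    /** bytes = sample rate x bytes per sample x channels x duration in seconds. */
    static long pcmBytes(float sampleRateHz, int bitsPerSample, int channels, double seconds) {
        return Math.round(sampleRateHz * (bitsPerSample / 8.0) * channels * seconds);
    }

    public static void main(String[] args) {
        // A 3-second recording, 16-bit mono (assumed format).
        System.out.println("8 kHz:    " + pcmBytes(8000.0f, 16, 1, 3.0) + " bytes");
        System.out.println("44.1 kHz: " + pcmBytes(44100.0f, 16, 1, 3.0) + " bytes");
    }
}
```

Under these assumptions, a 3-second sample takes about 48,000 bytes at 8 kHz and about 264,600 bytes at 44.1 kHz, roughly five and a half times as much.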


<beans ...>

	<context:annotation-config />

	<mvc:annotation-driven />
	<mvc:resources mapping="/static/index.html" location="/static/index.html" />

	<bean class="org.springframework.web.servlet.view.BeanNameViewResolver" />

	<bean id="viewJson" class="org.springframework.web.servlet.view.json.MappingJacksonJsonView" />

	<bean id="RecognitoController" class="com.bitsinharmony.recognito.webapp.RecognitoController">
		<property name="voice" ref="voiceDb" />
	</bean>
	
	<bean id="voiceDb" class="com.bitsinharmony.recognito.webapp.Voice">
		<constructor-arg index="0" type="java.lang.String" value="/home/recognito/audio/" />
	</bean>

	<bean id="multipartResolver" class="org.springframework.web.multipart.commons.CommonsMultipartResolver">

		<!-- setting maximum upload size -->
		<property name="maxUploadSize" value="100000000" />

	</bean>
</beans>

The user interface for testing:

[Image: ui]

And the final result (unfortunately “KO”).

Many thanks to Amaury Crickx for the Recognito library and Subin Siby for the JavaScript library.

The complete source code is available on GitHub.

My solution at https://github.com/MarcoGhise/BiometricAuthentication.

Recognito library at https://github.com/amaurycrickx/recognito.

 
