in Machine Learning by (19k points)

I am trying to get Apple's sample Core ML models that were demoed at WWDC 2017 to work correctly. I am using the GoogLeNetPlaces model to classify images (see the Apple Machine Learning page). The model takes a CVPixelBuffer as input. I have an image called imageSample.jpg that I'm using for this demo. My code is below:

var sample = UIImage(named: "imageSample")?.cgImage
let bufferThree = getCVPixelBuffer(sample!)

let model = GoogLeNetPlaces()
guard let output = try? model.prediction(input: GoogLeNetPlacesInput(sceneImage: bufferThree!)) else {
    fatalError("Unexpected runtime error.")
}
print(output.sceneLabel)

I always get the "Unexpected runtime error" rather than an image classification. My code to convert the image is below:

func getCVPixelBuffer(_ image: CGImage) -> CVPixelBuffer? {
    let imageWidth = image.width
    let imageHeight = image.height
    let attributes: [NSObject: AnyObject] = [
        kCVPixelBufferCGImageCompatibilityKey: true as AnyObject,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true as AnyObject
    ]

    var pxbuffer: CVPixelBuffer? = nil
    CVPixelBufferCreate(kCFAllocatorDefault,
                        imageWidth,
                        imageHeight,
                        kCVPixelFormatType_32ARGB,
                        attributes as CFDictionary?,
                        &pxbuffer)

    if let _pxbuffer = pxbuffer {
        let flags = CVPixelBufferLockFlags(rawValue: 0)
        CVPixelBufferLockBaseAddress(_pxbuffer, flags)
        let pxdata = CVPixelBufferGetBaseAddress(_pxbuffer)

        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        let context = CGContext(data: pxdata,
                                width: imageWidth,
                                height: imageHeight,
                                bitsPerComponent: 8,
                                bytesPerRow: CVPixelBufferGetBytesPerRow(_pxbuffer),
                                space: rgbColorSpace,
                                bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue)

        if let _context = context {
            _context.draw(image, in: CGRect(x: 0, y: 0, width: imageWidth, height: imageHeight))
        } else {
            CVPixelBufferUnlockBaseAddress(_pxbuffer, flags)
            return nil
        }

        CVPixelBufferUnlockBaseAddress(_pxbuffer, flags)
        return _pxbuffer
    }

    return nil
}

I took this code from a previous Stack Overflow post (last answer here). I recognize that the code may not be correct, but I have no idea how to do this myself, and I believe this is the section that contains the error. The model calls for an input of type Image<RGB, 224, 224>.
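One likely mismatch: getCVPixelBuffer(_:) creates the buffer at the image's native dimensions, while the model expects exactly 224x224. Below is a minimal sketch of pre-scaling the image with UIKit's UIGraphicsImageRenderer; the resizedImage helper is illustrative, not part of the original code:

import UIKit

// Illustrative helper: scale a UIImage to the 224x224 size the model
// expects before handing it to getCVPixelBuffer(_:).
func resizedImage(_ image: UIImage, to size: CGSize) -> UIImage {
    let format = UIGraphicsImageRendererFormat()
    format.scale = 1 // render at exact pixel size, not screen scale
    let renderer = UIGraphicsImageRenderer(size: size, format: format)
    return renderer.image { _ in
        image.draw(in: CGRect(origin: .zero, size: size))
    }
}

// Usage:
// let scaled = resizedImage(UIImage(named: "imageSample")!,
//                           to: CGSize(width: 224, height: 224))
// let buffer = getCVPixelBuffer(scaled.cgImage!)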

1 Answer

by (33.1k points)

You can use the Vision API to drive the model instead; it handles the image conversion for you:

import CoreML
import Vision

// Call from a throwing context, or wrap the two `try`s in do/catch.
let model = try VNCoreMLModel(for: MyCoreMLGeneratedModelClass().model)
let request = VNCoreMLRequest(model: model, completionHandler: myResultsMethod)
let handler = VNImageRequestHandler(url: myImageURL)
try handler.perform([request])

func myResultsMethod(request: VNRequest, error: Error?) {
    guard let results = request.results as? [VNClassificationObservation] else {
        fatalError("Unexpected result type from VNCoreMLRequest")
    }
    for classification in results {
        print(classification.identifier, // the scene label
              classification.confidence)
    }
}
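Vision scales and crops the input to the model's 224x224 requirement automatically, so the manual CVPixelBuffer conversion above becomes unnecessary. If the image is already in memory rather than at a URL, the handler can be built from a CGImage instead; a small sketch, reusing the request from above:

if let cgImage = UIImage(named: "imageSample")?.cgImage {
    let handler = VNImageRequestHandler(cgImage: cgImage)
    try? handler.perform([request])
}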

For more details on CVPixelBuffer and Core ML, a Machine Learning Course would be quite beneficial.

Hope this answer helps.
