I am working on an Azure ML text analytics implementation with NLTK, and the following execution throws this error:

AssertionError: 1 columns passed, passed data had 2 columns
Process returned with non-zero exit code 1

Below is the code:

# The script MUST include the following function,
# which is the entry point for this module:
# Param<dataframe1>: a pandas.DataFrame
# Param<dataframe2>: a pandas.DataFrame
def azureml_main(dataframe1 = None, dataframe2 = None):
    # import required packages
    import pandas as pd
    import nltk
    import numpy as np

    # tokenize the review text and store the word corpus
    word_dict = {}
    token_list = []
    nltk.download(info_or_id='punkt', download_dir='C:/users/client/nltk_data')
    nltk.download(info_or_id='maxent_treebank_pos_tagger', download_dir='C:/users/client/nltk_data')
    for text in dataframe1["tweet_text"]:
        tokens = nltk.word_tokenize(text.decode('utf8'))
        tagged = nltk.pos_tag(tokens)

    # convert feature vector to dataframe object
    dataframe_output = pd.DataFrame(tagged, columns=['Output'])
    return [dataframe_output]

The error is raised on this line:

 dataframe_output = pd.DataFrame(tagged, columns=['Output'])

I suspect the cause is the data type of tagged when it is passed to the DataFrame. Can someone tell me the right way to add this data to the DataFrame?

1 Answer

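Please try the line below. nltk.pos_tag returns a list of (word, tag) tuples, so each row has two values and the DataFrame needs two column names: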

dataframe_output = pd.DataFrame(tagged, columns=['Output', 'temp'])
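Note that in the question's loop, tagged is overwritten on every iteration, so only the last tweet's tags reach the DataFrame. If you want the tags for every tweet, something like the following may work. This is only a sketch: the sample data stands in for real nltk.pos_tag output, and the helper name tags_to_dataframe and the column names tweet_id, word and tag are made up for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for nltk.pos_tag output:
# one list of (word, tag) tuples per tweet.
tagged_per_tweet = [
    [("Azure", "NNP"), ("is", "VBZ"), ("great", "JJ")],
    [("I", "PRP"), ("like", "VBP"), ("NLTK", "NNP")],
]

def tags_to_dataframe(tagged_per_tweet):
    """Flatten per-tweet (word, tag) lists into one three-column DataFrame."""
    rows = []
    for tweet_id, tagged in enumerate(tagged_per_tweet):
        for word, tag in tagged:
            # Keep the tweet index so each tag can be traced back to its tweet.
            rows.append((tweet_id, word, tag))
    # Three values per row, so three column names are required.
    return pd.DataFrame(rows, columns=["tweet_id", "word", "tag"])

dataframe_output = tags_to_dataframe(tagged_per_tweet)
print(dataframe_output)
```

Inside azureml_main you would append nltk.pos_tag(tokens) to a list in the loop, pass that list to the helper, and return [dataframe_output] as before.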
