java.lang.Object
  org.apache.hadoop.hive.serde2.RegexSerDe
public class RegexSerDe
RegexSerDe uses a regular expression (regex) to deserialize data. It does not support serialization. Each row is matched against the regex, and the capturing groups are extracted as columns.

During deserialization, if a row does not match the regex, all columns in that row are NULL. If a row matches the regex but has fewer groups than expected, the missing groups are NULL. If a row matches the regex but has more groups than expected, the additional groups are ignored.

NOTE: All columns must be strings. Users can use "CAST(a AS INT)" to convert columns to other types.

NOTE: This implementation uses String and javaStringObjectInspector. A more efficient implementation would use UTF-8 encoded Text and writableStringObjectInspector. The switch should be made once a UTF-8 based regex library is available.
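The group-to-column rules above can be illustrated with plain `java.util.regex`. This is a minimal sketch, not the Hive implementation: the class `RegexRowSketch` and the method `toColumns` are hypothetical names, and it assumes the regex must match the whole row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: mimics the documented group-to-column rules.
class RegexRowSketch {

    // Maps one input row to exactly numColumns string values.
    static List<String> toColumns(Pattern pattern, String row, int numColumns) {
        List<String> columns = new ArrayList<>(numColumns);
        Matcher m = pattern.matcher(row);
        // Assumption: the regex has to match the entire row.
        boolean matched = m.matches();
        for (int c = 0; c < numColumns; c++) {
            if (matched && c < m.groupCount()) {
                // Capturing groups are 1-based; group 0 is the whole match.
                columns.add(m.group(c + 1));
            } else {
                // No match, or fewer groups than columns: the column is NULL.
                columns.add(null);
            }
        }
        // Groups beyond numColumns are never read, so they are ignored.
        return columns;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\S+) (\\S+)");
        System.out.println(toColumns(p, "GET /index.html", 2)); // both groups used
        System.out.println(toColumns(p, "GET /index.html", 3)); // third column is null
        System.out.println(toColumns(p, "GET /index.html", 1)); // second group ignored
        System.out.println(toColumns(p, "no-match", 2));        // every column is null
    }
}
```

Because every value comes back as a string (or null), any numeric interpretation is left to the query, which is where a conversion such as "CAST(a AS INT)" comes in.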
Field Summary | |
---|---|
static org.apache.commons.logging.Log | LOG |
Constructor Summary | |
---|---|
RegexSerDe() | |
Method Summary | |
---|---|
Object | deserialize(org.apache.hadoop.io.Writable blob) - Deserialize an object out of a Writable blob. |
ObjectInspector | getObjectInspector() - Get the object inspector that can be used to navigate through the internal structure of the Object returned from deserialize(...). |
SerDeStats | getSerDeStats() - Returns statistics collected when serializing. |
Class<? extends org.apache.hadoop.io.Writable> | getSerializedClass() - Returns the Writable class that would be returned by the serialize method. |
void | initialize(org.apache.hadoop.conf.Configuration conf, Properties tbl) - Initialize the HiveDeserializer. |
org.apache.hadoop.io.Writable | serialize(Object obj, ObjectInspector objInspector) - Serialize an object by navigating inside the Object with the ObjectInspector. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final org.apache.commons.logging.Log LOG
Constructor Detail |
---|
public RegexSerDe()
Method Detail |
---|
public void initialize(org.apache.hadoop.conf.Configuration conf, Properties tbl) throws SerDeException
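Description copied from interface: Deserializer
Initialize the HiveDeserializer.
Specified by: initialize in interface Deserializer
Specified by: initialize in interface Serializer
Parameters:
conf - System properties
tbl - table properties
Throws: SerDeException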
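As an illustration of what initialization from table properties might involve, the sketch below reads a regex and a column list from a `java.util.Properties` object. It is a hedged stand-in rather than the Hive code: the class name `RegexTableConfigSketch` is hypothetical, the property keys `input.regex` and `columns` are assumptions of this sketch, and it throws `IllegalArgumentException` instead of `SerDeException` so that it compiles without Hive on the classpath.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.regex.Pattern;

// Hypothetical stand-in for the state a regex deserializer might keep after initialization.
class RegexTableConfigSketch {
    final Pattern pattern;
    final List<String> columnNames;

    RegexTableConfigSketch(Properties tbl) {
        // Assumed property keys; they are not taken from this page.
        String regex = tbl.getProperty("input.regex");
        String columns = tbl.getProperty("columns");
        if (regex == null || columns == null) {
            throw new IllegalArgumentException("both 'input.regex' and 'columns' must be set");
        }
        // Compile once here so each row only pays for matching, not compilation.
        this.pattern = Pattern.compile(regex);
        this.columnNames = Arrays.asList(columns.split(","));
    }

    public static void main(String[] args) {
        Properties tbl = new Properties();
        tbl.setProperty("input.regex", "(\\d+),(\\w+)");
        tbl.setProperty("columns", "id,name");
        RegexTableConfigSketch cfg = new RegexTableConfigSketch(tbl);
        System.out.println(cfg.columnNames + " <- " + cfg.pattern);
    }
}
```

Failing fast on missing properties keeps configuration mistakes out of the per-row path, where the documented behavior for a non-matching row is NULL columns rather than an error.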
public ObjectInspector getObjectInspector() throws SerDeException
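Description copied from interface: Deserializer
Get the object inspector that can be used to navigate through the internal structure of the Object returned from deserialize(...).
Specified by: getObjectInspector in interface Deserializer
Throws: SerDeException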
public Class<? extends org.apache.hadoop.io.Writable> getSerializedClass()
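Description copied from interface: Serializer
Returns the Writable class that would be returned by the serialize method.
Specified by: getSerializedClass in interface Serializer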
public Object deserialize(org.apache.hadoop.io.Writable blob) throws SerDeException
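Description copied from interface: Deserializer
Deserialize an object out of a Writable blob.
Specified by: deserialize in interface Deserializer
Parameters:
blob - The Writable object containing a serialized object
Throws: SerDeException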
public org.apache.hadoop.io.Writable serialize(Object obj, ObjectInspector objInspector) throws SerDeException
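Description copied from interface: Serializer
Serialize an object by navigating inside the Object with the ObjectInspector.
Specified by: serialize in interface Serializer
Throws: SerDeException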
public SerDeStats getSerDeStats()
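Description copied from interface: Deserializer
Returns statistics collected when serializing.
Specified by: getSerDeStats in interface Deserializer
Specified by: getSerDeStats in interface Serializer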